Method for modulating gene expression by modifying the CpG content

ABSTRACT

The invention relates to nucleic acid modifications for a directed expression modulation by the targeted insertion or removal of CpG dinucleotides. The invention also relates to modified nucleic acids and expression vectors.

The present invention relates to modified polynucleotides that are derived from naturally occurring and synthetic genes or other coding sequences, and that have a reduced or increased number of CpG dinucleotides in the coding region compared to the original sequence. These polynucleotides may be used in order to investigate, increase or reduce the gene expression and, in a special case, to improve the production of biomolecules, the efficiency of DNA vaccines or gene therapy constructs, as well as the quality of transgenic animals or plants.

BACKGROUND OF THE INVENTION

The provision of biomolecules in the form of peptides, proteins or RNA molecules is an important component in the biotechnology and pharmaceutical sector. Proteins and RNAs produced by recombinant technology or expressed in vivo are used to investigate basic mechanisms and relationships, as well as in the production of biotechnology reagents, in the production of transgenic animals or plants, or for medical applications in the development of treatments and vaccines. Depending on the application, it should be possible to regulate the level of expression of the corresponding molecules.

In most cases increases above the standard production levels are desired. Each expression system or vector construct has limitations, which determine the actual production output. The present invention relates to methods and applications that are able to modulate the level of expression of arbitrary genes in eukaryotic cells. In particular the method is suitable for modulating arbitrary genes so that the achievable gene expression is above the level that can be achieved with hitherto known methods for increasing the expression.

PRIOR ART

CpG dinucleotides occupy an important position in the genome of eukaryotes. They are not randomly distributed like other dinucleotides, but instead are under-represented over wide stretches of the genome. In addition CpG dinucleotides in these regions are generally methylated.

An exception to this are regions that have a very much higher density of CpG dinucleotides, and which on account of these properties are termed CpG islands. A characteristic property of these CpG islands, and a further difference from the CpG dinucleotides elsewhere in the genome, is the fact that the CpG dinucleotides in the islands are as a rule not present in methylated form.

The under-representation of CpG dinucleotides is explained by a chemical modification of the corresponding nucleotides. In the genome of vertebrates about 60-90% of the cytosines in CpG dinucleotides are present in methylated form and these methylated cytosines are often modified by deamination to thymines (Shen et al., 1994).

As a result of this process the frequency of cytosines and guanosines is below the expected statistical distribution, and is about 40%, and the proportion of CpG dinucleotides is only about 20% of the frequency to be expected (Bird, 1980; Sved et al., 1990; Takai et al., 2002).
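As a rough worked illustration of these figures (not taken from the cited publications), assume that C and G each account for half of the roughly 40% C+G content and that the expected dinucleotide frequency is simply the product of the two base frequencies. Under these assumptions the expected and observed CpG frequencies would be approximately:

```latex
f^{\mathrm{exp}}_{\mathrm{CpG}} = f_{\mathrm{C}} \cdot f_{\mathrm{G}} \approx 0.20 \times 0.20 = 0.04,
\qquad
f^{\mathrm{obs}}_{\mathrm{CpG}} \approx 0.20 \times f^{\mathrm{exp}}_{\mathrm{CpG}} = 0.008
```

That is, on these simplifying assumptions roughly one dinucleotide in 125 would be a CpG instead of the one in 25 expected for a random sequence of the same base composition.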

CpG islands form an exception to this unusual distribution of CpG dinucleotides (Antequera et al., 1993). CpG islands are mostly located in the vicinity of promoters, and may extend into the transcribed region or even lie within exons.

They are characterised by an approximately ten-times higher CpG frequency (ca. 60-70% C+G content) compared to average gene regions, and are characterised especially by the fact that as a rule they contain non-methylated CpGs (Wise et al., 1999). About 60% of all human genes, especially all housekeeping genes and approximately half of tissue-specific genes are associated with CpG islands (Antequera et al., 1993; Larsen et al., 1992). CpG islands have been described and defined inter alia in the publications by Gardiner-Garden M. & Frommer M. (1987) J. Mol. Biol. 196, 261-282 and Takai D. & Jones P. A. (2002) PNAS 99, 3740-3745. Since various definitions exist in the prior art, for the purposes of the present invention a CpG island is defined as follows: a CpG island comprises a sequence of at least 500 successive base pairs with a CpG content of at least 55% and a quotient of (actual CpG/expected CpG) of at least 0.65, and it is associated with a promoter (overlapped wholly or partly by a promoter).
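The numerical part of this definition can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "content of at least 55%" is read here as the C+G fraction (as in Takai & Jones), the expected CpG count is taken as (number of C × number of G) / length, and the function names `cpg_stats` and `meets_island_thresholds` are hypothetical. The promoter association required by the definition is not checked.

```python
def cpg_stats(seq):
    """Return (length, C+G fraction, observed/expected CpG ratio) for a DNA string."""
    s = seq.upper()
    n = len(s)
    if n == 0:
        return 0, 0.0, 0.0
    c = s.count("C")
    g = s.count("G")
    observed = s.count("CG")          # "CG" cannot overlap itself, so count() is exact
    expected = c * g / n              # assumption: expected CpG = (#C * #G) / length
    gc_fraction = (c + g) / n
    ratio = observed / expected if expected else 0.0
    return n, gc_fraction, ratio


def meets_island_thresholds(seq, min_len=500, min_gc=0.55, min_ratio=0.65):
    """Check only the numerical thresholds; promoter overlap is NOT tested here."""
    n, gc_fraction, ratio = cpg_stats(seq)
    return n >= min_len and gc_fraction >= min_gc and ratio >= min_ratio


if __name__ == "__main__":
    print(cpg_stats("CG" * 250))
    print(meets_island_thresholds("CG" * 250))
    print(meets_island_thresholds("AT" * 250))
```

A sequence passing these thresholds would still need to overlap a promoter to count as a CpG island in the sense used here.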

This unequal distribution and modification of CpG dinucleotides, i) under-represented and methylated on the one hand, and ii) concentrated and unmethylated in islands on the other hand, has an important control function in the regulation of the gene expression (illustrated diagrammatically in FIG. 1).

CpG dinucleotides are involved in the regulation of the gene expression in early developmental stages, in connection with cell differentiation, genetic imprinting and other procedures. A large number of studies have shown that in eukaryotes, the methylation of 5′CpG3′ dinucleotides (mCpG) has a repressing effect on the gene expression in vertebrates and flowering plants (Hsieh, 1994; Kudo, 1998; Jones et al., 1998; Deng et al., 2001; Hisano et al., 2003; Li et al., 2004) (FIG. 1A).

Also, in tumour research there are numerous data that prove that, i) the switching off of the expression of certain genes, often suppressor genes, is caused by a hypermethylation of CpGs (Li et al., 2004; Kang et al., 2004; Ivanova et al., 2004; Wu et al., 2003), but also that, ii) the uncontrolled expression of other genes is associated with a hypomethylation (Akiyama et al., 2003; Yoshida et al., 2003).

The process of gene switching off by methylation is explained by a cascade of events which finally lead to a change of the chromatin structure, which creates a transcription-weak state. The methylation of 5′-CpG-3′ dinucleotides within genes generates a potential binding site for protein complexes (primarily from the family of MeCP (methyl-CpG-binding proteins) and MBD (methyl-CpG binding domain protein) proteins), which bind methylated DNA sequences and at the same time associate with histone deacetylases (MBD-HDAC) and transcriptional repressor proteins (Jones et al., 1998; Nan et al., 1998; Hendrich et al., 1998). As a rule these complexes bring about a restructuring of the chromatin, which leads to a switching off of the transcription activity (Wade et al., 1999). The methylation in promoter regions may also lead directly to a switching off of the gene expression, by preventing the binding of essential transcription factors due to the introduced methyl groups (Deng et al., 2001).

The above described deregulation of the expression in tumour cells is generally connected with an alteration of the methylation state in the above described CpG islands. In normal cells actively expressed genes are mostly associated with CpG islands, which are not, or are only slightly, methylated (FIG. 1B). The methylation of the CpG dinucleotides in these islands leads to a switching off of the expression of these genes (often tumour suppressor genes or cell cycle regulator genes) (FIG. 1C), and as a result leads to an uncontrolled multiplication of these cells. Conversely, genes that are inactive due to a methylation of the CpG dinucleotides in CpG islands are activated by a demethylation.

The aforedescribed demethylation in CpG islands leads, through an alteration of the chromatin structure, to a transcription-active state analogous to gene switching off in the case of a methylation. In addition to structural alterations there may also be an activation of the expression due to activator proteins. The human CpG-binding protein (hCGBP) is such a cellular activator protein. hCGBP binds specifically to non-methylated CpG dinucleotides in the region of promoters, where as a transactivator it leads to an increase in transcription (Voo et al., 2000).

Hitherto the knowledge that a methylation of the CpG sequences within a gene regulates the transcription downwards has been used to prevent the expression of a gene that is either over-expressed, or whose expression is undesired, by methylation (Choi et al., 2004; Yao et al., 2003) (cf. FIG. 1A).

A further application of this knowledge is the targeted elimination of such CpG dinucleotides in order to improve gene expression (Chevalier-Mariette et al., 2003). Due to an elimination a methylation and, associated therewith, a change of the chromatin structure to a transcription-inactive state, is likewise prevented (FIG. 1D). This publication investigates the expression of a transgene with various CpG dinucleotide contents in operative coupling with a promoter that is located within a CpG island, in germ line cells and the embryos of transgenic mice formed therefrom. In this special case a transcriptional switching off of a reporter gene was prevented by the elimination of CpG dinucleotides (FIG. 1D, transgene without CpG), as is otherwise to be expected by a de novo methylation of existing CpG dinucleotides during embryo development (FIG. 1D, transgene CpG high). A more detailed investigation of the mechanism in the publication by Chevalier-Mariette showed that the prevention of the gene expression is connected with a methylation of the intragenic CpG dinucleotides, as well as and especially with a subsequently occurring methylation of the promoter-associated CpG islands (FIG. 1D, transgene CpG high). For a reporter gene that did not have these intragenic CpG dinucleotides and that was efficiently expressed, it was shown that the CpG island was not methylated (FIG. 1D, transgene without CpG). The authors therefore concluded that, for a lasting in vivo expression, the CpG dinucleotide content must be reduced in the immediate vicinity of the promoter and of the CpG island.

An increase in gene expression may similarly be achieved by the integration of complete CpG islands 5′ of a promoter in corresponding vector constructs (WO 02081677) (cf. FIG. 1B). In the identification of hCGBP, CpG dinucleotides were likewise integrated into the corresponding promoter region of a reporter gene and an increase in reporter activity was found. In these transient cell culture tests however the hCGBP was likewise over-expressed and was therefore present in non-physiological raised concentrations (Voo et al., 2000).

It is already known that the C/G content has an influence on the mRNA stability. Thus, for example, Duan and Antezana (2003) show that the expression of three different variants of a human gene in CHO cells consequently leads to differences in the mRNA concentration. In the first variant the human gene sequence had been altered so that the number of C/G dinucleotides was maximised. In a second variant on the other hand, the number of T/A dinucleotides had been maximised. The differences in the steady-state level, i.e. in the amount of mRNA, could be attributed experimentally to differences in the breakdown of the mRNA. On account of a stabilisation of the secondary structure with a raised C/G content, corresponding mRNAs were less strongly broken down than wild type, and correspondingly T/A-rich mRNAs were broken down to a much higher degree than wild type. An analysis was not carried out at the protein level, and also no increase in protein production due to an increase of the CpG dinucleotides was to be expected, since a stabilisation of the secondary structure of the mRNA negatively influences the translation.

Deml et al. disclose a sequence of the HIV-1 Gag gene for codon-optimised expression in mammalian cells. A specific increase of CpG dinucleotides is not disclosed.

The object of the invention was to provide a method for the targeted modulation of the gene expression that at least partly avoids the disadvantages of the prior art.

This object is achieved by a method for the targeted modulation of the gene expression, comprising the following steps:

-   i. Provision of a target nucleic acid sequence to be expressed,
-   ii. Modification of the target nucleic acid sequence, in which the number of CpG dinucleotides present in the target nucleic acid sequence is raised using the degeneracy of the genetic code to increase the gene expression, or is lowered to reduce the gene expression,
-   iii. Cloning of the thereby modified target nucleic acid sequence with a modified number of the CpG dinucleotides in a suitable expression vector in operative coupling with a suitable transcription control sequence,
-   iv. Expression of the modified target nucleic acid sequence in a suitable expression system.
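Step ii can be illustrated with a minimal sketch that recodes an open reading frame with synonymous codons so that the CpG count is as high (or as low) as the standard genetic code allows. This is not the procedure used in the examples of this document; it is a simple dynamic-programming illustration that assumes the standard genetic code, a reading frame starting at the first base, and that only the CpG count matters (codon usage, RNA structure and sequence motifs are ignored). The names `recode_cpg`, `translate` and `count_cpg` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
SYNONYMS = {}
for codon, aa in CODON_TO_AA.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def translate(seq):
    """Translate an in-frame DNA string into one-letter amino acids."""
    seq = seq.upper()
    return "".join(CODON_TO_AA[seq[i:i + 3]] for i in range(0, len(seq), 3))


def count_cpg(seq):
    """Count CpG dinucleotides on the given strand."""
    return seq.upper().count("CG")


def recode_cpg(seq, maximise=True):
    """Return a synonymous sequence with the most (or fewest) CpGs.

    Dynamic programming over codon positions: the score of a codon choice is
    its internal CpGs plus a CpG formed at the junction with the previous codon.
    """
    seq = seq.upper()
    if len(seq) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    sign = 1 if maximise else -1
    best = {}  # last codon chosen -> (signed score, list of codons so far)
    for pos in range(0, len(seq), 3):
        options = SYNONYMS[CODON_TO_AA[seq[pos:pos + 3]]]
        new_best = {}
        for cand in options:
            inner = sign * cand.count("CG")
            if pos == 0:
                new_best[cand] = (inner, [cand])
                continue
            # A junction CpG arises when the previous codon ends in C
            # and the candidate codon starts with G.
            score, path = max(
                ((s + sign * (prev[-1] == "C" and cand[0] == "G"), p)
                 for prev, (s, p) in best.items()),
                key=lambda t: t[0],
            )
            new_best[cand] = (score + inner, path + [cand])
        best = new_best
    if not best:
        return ""
    return "".join(max(best.values(), key=lambda t: t[0])[1])


if __name__ == "__main__":
    orf = "GCTAGAGGA"  # Ala-Arg-Gly, chosen only for illustration
    high = recode_cpg(orf, maximise=True)
    low = recode_cpg(orf, maximise=False)
    print(high, count_cpg(high), translate(high))
    print(low, count_cpg(low), translate(low))
```

In practice the choice among equally CpG-rich candidates would be constrained further, for example by codon usage and by avoiding undesired motifs, as discussed later in the description.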

It was extremely surprisingly found that, when using the method of the present invention, exactly the opposite effect can be achieved than would have been expected according to a knowledge of the prior art. This means that, with the method of the present invention, by raising the number of CpG dinucleotides in a target nucleic acid sequence the expression of this target nucleic acid sequence can be raised, whereas by reducing the number of CpG dinucleotides in the target nucleic acid sequence its expression is prevented. The increase in the number of CpG dinucleotides in the reading frame should, according to the invention, not be equated to the introduction of a CpG island. The increase of the CpG dinucleotides in the reading frame differs by definition from a CpG island, due to i) a possible lower base number (<500) and ii) the absence of an overlapping with the promoter region.

The expression system may on the one hand be a cell, or on the other hand a cell-free system or in vitro system. A prokaryotic or eukaryotic expression system may be used, though a eukaryotic expression system is preferably employed. Suitable expression systems include e.g. bacterial cells, insect cells, e.g. Baculovirus expression systems, Sf9 cells, Drosophila-Schneider cells, plant cells, yeasts, e.g. Saccharomyces cerevisiae, Pichia angusta, Pichia pastoris and the like; as well as also algae, e.g. Chlamydomonas. Examples of possible plant expression systems include Arabidopsis thaliana, Zea mays (corn), Nicotiana tabacum (tobacco), Oryza sativa (rice), Hordeum vulgare (barley), Glycine max (soya), Brassica sp. (cabbage) and the like.

Preferably vertebrate cells are used, in particular mammalian cells, especially human cells, in particular somatic cells and not germ line cells. Particularly preferably the expression system is a system or a cell with a low level of methylation, i.e. substantially no de novo methylation takes place. On the other hand it is also possible to use this method for the production of transgenic non-human organisms, in particular plants and animals.

The present invention thus relates in particular to a method for the targeted alteration of the level of expression of a transcript and/or for the targeted alteration of protein production, in particular in eukaryotic cells. The method is characterised by modifications of the reading frame of a DNA sequence to be transcribed.

The modifications relate to a variation of the proportion of CpG dinucleotides, which correlates with a change of the level of expression.

The technology of artificial gene synthesis enables any arbitrary nucleotide sequence chosen from these possibilities to be synthesised. By varying motifs within the coding region of a gene which correlate with the level of expression, the protein production can with this technology be modulated in a targeted manner by the choice of the corresponding nucleotide sequence. Within the scope of the present invention CpG dinucleotides were identified as such a motif having a direct influence on the level of expression.

It was surprisingly found that, in contrast to the generally accepted opinion, the introduction of CpG dinucleotides in the way and manner according to the invention leads to an increase of the gene expression instead of to a reduction of the expression. Conversely, the elimination of CpGs leads to a reduction of the gene expression.

The term “gene expression” within the context of the present invention includes both transcription as well as translation, and in particular this term is understood to include protein production.

These changes at the nucleic acid level are introduced within the scope of the present invention preferably by the production of an artificial gene by de novo gene synthesis, in which the amino acid sequence for which the corresponding gene codes preferably remains unchanged.

De novo gene synthesis methods are known to the person skilled in the art in this field. The alteration of the CpG content is preferably carried out by silent mutations or by mutations that do not destroy the activity of the gene product. The modified target nucleic acid sequences may, as stated in the example, be produced for example from long oligonucleotides by a stepwise PCR or, in the case of conventional gene synthesis, may be ordered from specialist suppliers (e.g. Geneart GmbH, Qiagen AG).

Surprisingly, the expression of the corresponding gene can be negatively influenced (smaller number of CpG) or positively influenced (increased number of CpG) by suitably choosing the number of CpG dinucleotides, and may even exceed the expression rates that can be achieved with a codon-optimised gene. The expression may unexpectedly even be raised if the increase in the number of CpG dinucleotides takes place at the expense of the RNA and codon optimisation. Preferably no CpG islands are introduced in the modification of the target nucleic acid sequence, and preferably the modified target nucleic acid sequence is not associated with CpG islands. By way of delimitation as regards the defined CpG islands, whose influence on the expression operates according to the all-or-none principle, in the present invention a correlation is found between the level of expression and the number of CpG dinucleotides.

For the expression of genes these modifications are preferably introduced so that the coded amino acid sequence is not altered. In the ideal case only the nucleic acid sequence of a corresponding gene should influence its level of expression. Since the genetic code is degenerate, there is the possibility, for a specific amino acid sequence, of choosing a plurality of corresponding nucleic acid sequences.

By way of delimitation as regards the hitherto described methods, 1) the region coding for the transcript should be modified, whereby this method can be used independently of vectors and other gene technology conditions, and 2) for an increase in the level of expression the number of CpG dinucleotides should be raised. The additionally introduced CpGs are in this connection not methylated.

Preferably the number of CpG dinucleotides compared to the sequence of the target nucleic acid sequence to be expressed is increased or reduced, depending on the desired level of expression, by at least 2, preferably at least 3, more preferably at least 5, still more preferably at least 8, yet more preferably at least 10, even more preferably by at least 15, and up to 20 or more, especially by 30-50 or even up to 100 or more, depending on the length of the target nucleic acid sequence to be expressed.

Preferably the number of CpG dinucleotides compared to the sequence of the target nucleic acid to be expressed is raised by at least 10%, preferably at least 20%, more preferably at least 50%, particularly preferably at least 100%, and especially at least 200%, or by a factor of 5 or 10.
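The absolute and relative changes referred to in the two preceding paragraphs can be computed directly from the original and the modified sequence. The following is a minimal sketch; the function name `cpg_change` and the example sequences are hypothetical and serve only to show the arithmetic.

```python
def cpg_change(original, modified):
    """Return (CpGs before, CpGs after, absolute change, percent change).

    The percent change is None when the original contains no CpG,
    because a relative increase is then undefined.
    """
    before = original.upper().count("CG")
    after = modified.upper().count("CG")
    delta = after - before
    percent = None if before == 0 else 100.0 * delta / before
    return before, after, delta, percent


if __name__ == "__main__":
    # Toy sequences for illustration only.
    print(cpg_change("ACGTACGT", "ACGCGCGT"))
```

An increase "by a factor of 5" corresponds to a percent change of +400% in this convention, and an increase of 100% to a doubling.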

If CpGs are eliminated, it is preferred to eliminate all CpGs that can be eliminated within the scope of the genetic code. However, fewer CpGs can also be eliminated, for example 10%, 50% or 75%, in which case the elimination again depends on the desired level of expression.

Within the scope of the present invention it has surprisingly been found that increasing or reducing the number of CpG dinucleotides permits a stepwise modulation of the gene expression. A dose effect was surprisingly observed. This means that the level of gene expression can be adjusted by the addition or elimination of more or fewer CpG dinucleotides.

As already mentioned, it is possible and preferred to make use of the degeneracy of the genetic code so that preferably the maximum number of CpG dinucleotides is introduced or eliminated without having to alter the amino acid sequence encoded by the target nucleic acid sequence to be expressed. The maximum number of CpG dinucleotides to be introduced is preferably limited by the variation possibilities of the degenerate codons of a predetermined amino acid sequence.

On the other hand, if desired the number of CpG dinucleotides may be increased still further, even if the corresponding amino acid sequence is thereby altered. In this case care should be taken to ensure that the function of the peptide or protein is not interfered with.

The CpG dinucleotides may, depending on the type of degeneracy of the genetic code, be removed or added within a codon or also spanning the boundary between two adjacent codons.
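The distinction between CpGs lying within a codon and CpGs spanning two adjacent codons can be made explicit by looking at the position of each CpG relative to the reading frame. The sketch below assumes that the reading frame starts at the first base of the sequence; the function name `cpg_by_frame_position` and the dictionary keys are hypothetical.

```python
def cpg_by_frame_position(seq):
    """Count CpGs by their position relative to the reading frame.

    "pos1-2"   : C at codon position 1, G at position 2 (within a codon)
    "pos2-3"   : C at codon position 2, G at position 3 (within a codon)
    "junction" : C at position 3 of one codon, G at position 1 of the next
    Assumes the reading frame starts at index 0.
    """
    s = seq.upper()
    labels = ("pos1-2", "pos2-3", "junction")
    counts = {label: 0 for label in labels}
    i = s.find("CG")
    while i != -1:
        counts[labels[i % 3]] += 1   # frame phase of the C decides the class
        i = s.find("CG", i + 1)
    return counts


if __name__ == "__main__":
    # GCG CGC GGT: one CpG of each class (toy example).
    print(cpg_by_frame_position("GCGCGCGGT"))
```

Junction CpGs depend on the choice of two neighbouring codons at once, which is why a codon-by-codon substitution alone does not necessarily reach the maximum or minimum CpG count.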

In addition to the change in the number of CpG dinucleotides in the target nucleic acid to be expressed, the latter may be changed further at the nucleic acid level depending on the desired degree of gene expression.

If for example an increase in gene expression is aimed for, then the number of CpG dinucleotides is preferably raised in such a way that, due to the introduction of further CpG dinucleotides, no disadvantageous effects occur, such as for example more pronounced secondary structures of the mRNA, which could have a disadvantageous effect on the translation, or further motifs that could negatively influence the expression, e.g. RNA instability motifs, splice-activating motifs, endonuclease recognition sites, and the like. On the other hand it is of course also possible, if the number of CpG dinucleotides is decreased in order to reduce the gene expression, to eliminate the CpG dinucleotides specifically at those sites which, after alteration of the nucleic acid sequence, lead to specifically these motifs.

Again, it is of course also possible and also preferred, in addition to increasing or reducing the number of CpG dinucleotides, moreover to carry out a nucleic acid optimisation, so that the gene expression is either promoted, or is inhibited or reduced.

Such optimisations are accordingly the insertion or removal of motifs that can influence the gene expression, for example secondary structure-stabilising sequences, regions with raised self-homology, regions with raised homology with respect to the natural gene, RNA instability motifs, splice-activating motifs, polyadenylation motifs, adenine-rich sequence stretches, endonuclease recognition sites and the like. Yet a further possible way of optimisation consists in optimising in each case the codon choice for the desired expression system.

This means that, within the scope of the present invention, the expression may also be raised or reduced if, in addition to the insertion of CpG dinucleotides, the codon choice is optimised or made worse. Expression-optimised constructs according to the invention can be produced for example by choosing the codon distribution to be the same as in the expression system that is used.

Preferably the eukaryotic expression system is a mammalian system, preferably a human system. Preferably therefore the codon optimisation is matched to the codon choice of human genes. Preferably in this connection, a codon choice is used that is most frequently or next to most frequently employed in mammalian cells (Ausubel et al., 1994), in order to ensure a general stabilisation of the RNA and an optimal codon choice. Still more preferably, the nucleic acid sequence is modified for an optimal expression by using the gene optimiser technology (DE 102 60 805.9 or PCT/EP03/14850).

In contrast to the codon optimisation, “poor” codons seldom used by the expression system may however also be employed in order to increase the number of CpG dinucleotides.

In the method according to the invention a heterologous target nucleic acid sequence may also be used. The expression “heterologous target nucleic acid sequence” refers to the origin of the target nucleic acid sequence and to the origin of the expression system. Preferably therefore the target nucleic acid sequence and the expression system are heterologous to one another, i.e. they are derived from different species and/or the codon choice of the wild-type target nucleic acid sequence differs from that of the expression system. The term “heterologous” within the context of the invention thus also includes differences with respect to the codon choice. The codon choice denotes the preferred codon usage for a respective species, within the scope of the degeneracy of the genetic code.

As expression vector there may be used any suitable expression vector. Such a vector is preferably suitable for expression in eukaryotic cells. The modified target nucleic acid sequence to be expressed is cloned into the vector so that it is in operative coupling with a suitable transcription control sequence and possibly further regulator elements. A suitable promoter, which may either be constitutive or inducible, may be such a transcription control sequence.

Constitutively active promoters are preferably selected from, but not restricted to, the CMV (Cytomegalovirus) promoter and the Simian Virus 40 (SV40) promoter. Inducible promoters include, but are not restricted to, tetracycline-dependent promoters. The person skilled in the art is capable without any difficulty of selecting further suitable promoters depending on the application, e.g. also promoters of cellular origin.

In this connection, in principle any inducible promoter system that is known in the prior art is suitable. For example, a natural or artificial inducible promoter may be used, for example a promoter inducible by tetracycline (Tet on/Tet off system). Furthermore, an inducible viral promoter may however also be used.

Preferably the inducible promoter can be induced by a transactive factor. A viral inducible promoter which can be induced by a viral transactive factor may be derived from an arbitrary virus. Sequences of retroviruses, HCV (Hepatitis C virus), HBV (Hepatitis B virus), HSV (Herpes Simplex virus), EBV (Epstein-Barr virus), SV 40 (Simian virus 40), AAV (Adeno-associated virus), Adenovirus, Papilloma viruses or Ebola virus are preferably used for this purpose. The transactive factors used in this connection are accordingly selected for example from the following viral factors, but are not restricted to these: NS5A (HCV), HB X (HBV), VP16/ICP4 (HSV), EBNA1/Rta (EBV), ART (HHV8), Large T-Antigen (SV40), Rep78/68 (AAV), E1A (Adenovirus), E2 (Papilloma virus) and VP30 (Ebola virus).

As inducible promoter that can be induced by a viral transactive factor, there is preferably used a retroviral LTR promoter or a functional partial sequence thereof. Preferably therefore the transactive factor is a retroviral Tat or Tax protein. The LTR promoter may be selected from the LTRs of HIV-1, HIV-2, SIV, HTLV and other related retroviruses that have LTR promoters. In particular lentiviral promoters are preferred, especially those of HIV.

Preferably the transcription control sequences, i.e. for example promoters and/or enhancers, etc., used within the scope of the present invention are not associated with CpG islands.

It is also possible, in addition to increasing the number of CpG dinucleotides in the target nucleic acid to be expressed, to reduce the number of CpG dinucleotides in the remaining sequences or parts thereof present on the vector. In this connection the CpG dinucleotides in these remaining vector sequences or parts thereof may be completely eliminated. Preferably this is again carried out while retaining the amino acid sequence by utilising the degeneracy of the genetic code. Also, only a partial elimination of the CpG dinucleotides in these sequences may take place, for example of at least 5%, preferably at least 10%, more preferably at least 15%, particularly preferably at least 25%, more particularly preferably 50%, and most particularly preferably 75% or more. Preferably all CpGs are removed insofar as this is possible.

Thus, depending on the application (silencing or increasing the expression) the number of CpG dinucleotides may be varied independently of the chosen codon optimisation.

In most cases a complete elimination of CpGs from the reading frame is possible. The coded amino acid sequence sets the upper limit, i.e. it limits how far the number of CpGs can be increased.

The target nucleic acid sequence may code for an RNA, derivatives or mimetics thereof, a peptide or polypeptide, a modified peptide or polypeptide, a protein or a modified protein.

The target nucleic acid sequence may also be a chimera and/or assembled sequence of different wild-type sequences, and for example it may code for a fusion protein or mosaic-like constructed polygene constructs.

The target nucleic acid sequence may also code for a synthetic sequence. In this connection it is also possible to model the nucleic acid sequence synthetically, for example with the aid of a computer model.

The target nucleic acid sequence to be expressed may preferably be a sequence for a gene for an arbitrary protein, for example a recombinant protein, an artificial polypeptide, a fusion protein and the like. Diagnostic and/or therapeutic peptides, polypeptides and proteins are preferred. The peptide/protein may for example be used for i) the production of therapeutic products, such as e.g. human enzymes (e.g. asparaginase, adenosine deaminase, insulin, tPA, clotting factors, vitamin K epoxide reductase), hormones (e.g. erythropoietin, follicle-stimulating hormone, oestrogens) and other proteins of human origin (e.g. bone morphogenic proteins, antithrombin), ii) viral, bacterial proteins or proteins derived from parasites, which may be used as vaccines (derived from HIV, HBV, HCV, influenza, Borrelia, Haemophilus, meningococcus, anthrax, botulinum toxin, diphtheria toxin, tetanus toxin, Plasmodium, etc.) or iii) proteins that may be used for the production of diagnostic test systems (e.g. blood group antigens, HLA proteins).

As a further possibility, a gene may be chosen that produces messenger substances (cytokines/chemokines), e.g. G-CSF, GM-CSF, interleukins, interferons, PDGF, TNF, RANTES or MIP1α or domains, fragments or variants thereof, which are capable of actuating the natural defense mechanisms of adjacent cells or, in combination with suitable antigens, of amplifying a specific immune response.

A further possible use is in the production of proteins, such as for example enzymes (polymerases, proteases, etc.) for biotechnology applications.

The target nucleic acid to be expressed may also be a regulator gene, whose product, after expression in a cell, acts as a molecular switch that switches the expression of other genes on or off. As such a regulator gene there may for example be used a component of a signal transduction pathway or a transcription factor. The term “expression” includes in this connection the transcription of the target nucleic acids and possibly the translation of the RNA obtained by transcription.

Finally, the target nucleic acid to be expressed may be a functional RNA (e.g. ribozyme, decoy or siRNA), which may preferably be used for therapeutic or enzymatic purposes.

The present invention furthermore relates to a modified nucleic acid with a region capable of transcription, which is derived from a wild-type sequence, wherein the region capable of transcription is modified so that the number of CpG dinucleotides is increased compared to the wild-type sequence, by using the degeneracy of the genetic code. The modified nucleic acid may be expressed in an expression system as described above, and the region capable of transcription is modified so that it is codon-optimised in relation to the used expression system, and so that the number of CpG dinucleotides compared to the codon-optimised sequence derived from the wild-type sequence is raised, using the degeneracy of the genetic code.

A wild-type sequence within the meaning of the present invention is a naturally occurring nucleic acid sequence.

As already mentioned above, it is however also possible for the targetnucleic acid sequence to code for an assembled gene sequence, which maybe assembled from different wild-type sequences. In such a case thewild-type sequence refers to the sequence that has not yet been modifiedwithin the meaning of the present invention (increase or reduction ofthe number of CpG dinucleotides).

The number of CpG dinucleotides in the nucleic acid according to theinvention may, as mentioned above, be increased by several CpGdinucleotides. Preferably the number is raised to the maximum numberthat is possible within the scope of the degeneracy of the genetic code.
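The notion of a maximum CpG number within the degeneracy of the genetic code can be illustrated with a minimal sketch. It assumes the standard genetic code and scores only CpG count; codon usage, splice sites and other cis-acting elements, which the design described in this document also takes into account, are deliberately ignored. The function name `maximise_cpg` and the dynamic-programming approach are illustrative assumptions, not the procedure used by the inventors.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
AA_TO_CODONS = {}
for codon, aa in CODON_TO_AA.items():
    AA_TO_CODONS.setdefault(aa, []).append(codon)


def count_cpg(seq):
    """Number of CpG dinucleotides on the given strand."""
    return seq.upper().count("CG")


def maximise_cpg(protein):
    """Return (dna, n_cpg) for a synonymous encoding with the most CpGs.

    Dynamic programming over codons: a CpG can lie inside a codon or span
    the junction between two codons, so the best prefix is tracked per
    choice of last codon.
    """
    best = {"": (0, "")}  # last codon -> (score, sequence so far)
    for aa in protein:
        nxt = {}
        for codon in AA_TO_CODONS[aa]:
            candidates = []
            for prev, (score, seq) in best.items():
                junction = 1 if prev.endswith("C") and codon.startswith("G") else 0
                candidates.append((score + junction + count_cpg(codon), seq + codon))
            nxt[codon] = max(candidates, key=lambda t: t[0])
        best = nxt
    score, seq = max(best.values(), key=lambda t: t[0])
    return seq, score


dna, n_cpg = maximise_cpg("MAR")  # hypothetical three-residue peptide
print(dna, n_cpg)
```

Because several encodings can reach the same maximum, the sequence returned is one of possibly many optimal solutions; a practical design would add further criteria to choose among them.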

The present invention also provides an expression vector, which includes an aforementioned modified nucleic acid according to the invention in operative coupling with suitable transcription control sequences. The vector is preferably used to increase the expression in eukaryotic cells of an arbitrary DNA sequence. The vector is preferably derived from known vectors. In the sequence regions of the vector that are different from the modified nucleic acid sequence according to the invention, the number of CpG dinucleotides is preferably reduced. Preferably the number of CpG dinucleotides in these remaining vector sequences or parts thereof is reduced by at least 5%, preferably at least 10%, more preferably at least 15%, still more preferably at least 25%, in particular at least 50%, and most particularly preferably at least 75% or more.

The reduction of CpGs is preferably achieved by artificial gene synthesis of the individual vector modules (antibiotic resistance gene, selection marker, multiple cloning site, etc.) as described above. The individual modules are assembled with corresponding DNA fragments of essential, non-alterable modules (replication origin, polyadenylation site, viral promoter, etc.) using unique restriction sites, to form a functional vector. The vector may be of viral (e.g. derived from adenoviruses, retroviruses, Herpes viruses, alpha viruses, etc.) or bacterial origin, or naked DNA (expression plasmids).

The modular construction of the vector moreover permits a rapid and easily effected alteration as regards the individual modules. The number of modules may be varied and adapted corresponding to the application.

For a stable integration in cells, elements such as eukaryotic selection markers (e.g. resistance genes to hygromycin, zeocin, etc.; selection reporters such as GFP, LNGFR, etc.; or recombination sequences for a directed recombination) may be used, in which the corresponding gene sequences can, as far as possible, also be reduced as regards the content of CpGs. For applications in gene therapy, sequences can be introduced that counteract immunostimulating motifs (e.g. immuno-repressive CpG motifs). Accordingly, for applications in immunisations, such as for example in vaccinations or for the production of antibodies, sequences may be integrated that contain immunostimulating factors (e.g. immunostimulating CpG motifs).

A preferred vector for use in the present invention is the vector illustrated in SEQ ID NO. 27.

The present invention also provides eukaryotic cells, more preferably mammalian cells, most particularly preferably human cells, that contain a target nucleic acid or a vector (preferably in the form of a DNA construct) as described above, in which the nucleic acid or the vector is present in a form capable of transcription. The cells are preferably somatic cells or, more preferably, those cells that basically do not carry out any de novo methylation.

The DNA construct may for example be present episomally or integrated stably into the chromosome. In this connection one or more copies may be present in the cell. To introduce the said DNA constructs, gene delivery vehicles of viral (e.g. adenoviruses, retroviruses, Herpes viruses, alpha viruses, etc.) or bacterial origin or naked DNA (expression plasmids) may be used.

The present invention moreover provides an expression system comprising:

-   a) a modified nucleic acid sequence with a region capable of transcription, which is derived from a wild-type sequence, wherein the modified nucleic acid sequence has an increased or reduced number of CpG dinucleotides compared to the wild-type sequence, in operative coupling with a transcription control sequence, and
-   b) an expression environment selected from a cell and a cell-free expression environment wherein a) can be expressed, in which the expression system in the case of expression of a modified nucleic acid sequence with an increased number of CpG dinucleotides exhibits an increased expression, and in the case of expression of a modified nucleic acid sequence with a reduced number of CpG dinucleotides exhibits a reduced expression.

The present invention can thus be used so as to increase or reduce the expression of a target nucleic acid sequence. If the expression is raised, then preferably an increase of the expression of at least 5%, more preferably at least 10%, still more preferably at least 20%, even more preferably at least 30%, especially at least 50% and most especially at least 100-400% or more, should be achieved. Depending on the length of the target nucleic acid sequence to be expressed and the number of CpG dinucleotides that can be introduced, an increase in the expression by a factor of 2, 3, 5 or even 10 to 20, or possibly up to 100 to 200, may also be achieved.

If a reduction of the expression is desired, then preferably a reduction of the expression, in other words for example a reduction of the transcript amount of at least 10%, preferably at least 20%, more preferably at least 30%, still more preferably at least 50% and especially at least 75%, should be carried out. Preferably the expression should approach the limit of detection.

As already explained above in detail, the level of transcription depends on the number of CpG dinucleotides in the gene. This means that in the case of longer genes or in genes with more possibilities of introducing CpG dinucleotides, a higher level of expression should be achieved. Conversely, it should be possible with the aid of the present invention to reduce the expression significantly by the targeted elimination of as far as possible all CpG dinucleotides, and depending on the application even to the limit of detection.

The present invention additionally provides medicaments and diagnostic agents based on the modified nucleic acids and/or vectors according to the invention. The modified nucleic acids and vectors may be used in diagnostic, therapeutic and/or gene therapy applications, in particular also for the production of vaccines.

In particular the method according to the invention and the expression systems, nucleic acid sequences, vectors and cells according to the invention may be used for the production of DNA vaccines. As an alternative to conventional inactivated (dead) vaccines and live vaccines, the development of vaccines that are based on “naked” plasmid DNA is becoming increasingly important. The advantage of DNA vaccines lies in an uptake of the DNA in cells, combined with the authentic production (including modification) of antigens and an efficient activation of a cellular and humoral immune response. In this connection the level of the induced immune response correlates with the amount of antigen produced and thus with the expression output of the DNA constructs. If the expression of an arbitrary antigen can be increased by the accumulation of CpG dinucleotides in the coding sequence, then as a result the activation of the immune system and thus the protective effect is improved.

DESCRIPTION OF THE DIAGRAMS

FIG. 1:

Regulation of the gene expression by methylation (prior art).

FIG. 1A: Methylation of CpG dinucleotides leads to the switching off of the gene expression.

FIG. 1B: CpG islands protect against a methylation and the switching off associated therewith.

FIG. 1C: Secondary hypomethylation of the CpG islands leads to a gene switching off.

FIG. 1D: Secondary hypomethylation may be prevented by reducing the CpG dinucleotides in the reading frame.

FIG. 2: GFP expression analysis in stably transfected cells.

FIG. 2A and FIG. 2B: Long-term flow cytometry analysis of stably transfected Flp-In 293T and CHO cells. The Y axis gives the GFP-mediated fluorescence intensity (MFI, “mean fluorescence intensity”) and the X axis gives the measurement times in weeks after transfection.

FIG. 2A: FACS analysis of huGFP and ΔCpG-GFP recombinant 293T cells.

FIG. 2B: FACS analysis of huGFP and ΔCpG-GFP recombinant CHO cells.

FIG. 2C: Fluorescence microscopy image of stable cell lines.

FIG. 3: GFP protein detection in stably transfected cells.

Expression analysis of the GFP reading frame.

Recombinant Flp-In CHO cells that have integrated the huGFP or the ΔCpG-GFP gene stably into the cell genome were lysed, and the expression of the genes was detected by conventional immunoblot analyses. Plots of the huGFP, ΔCpG-GFP and mock samples are given. Monoclonal cell lines were established from both polyclonal cell cultures (poly.) (mono. 14 and 7 for ΔCpG-GFP and mono. 10 and 9 for huGFP). Mock cells correspond to an unchanged initial cell population.

FIG. 4:

Quantitative determination of specific transcripts of stable cells. Real-time PCR analysis of specific hygromycin-resistance gene and gfp RNAs from cytoplasmic RNA preparations. The real-time PCR evaluations of the light cycler (LC) analyses are shown for CHO cells (hygromycin-resistance FIG. 4A and gfp FIG. 4B) as well as for 293T cells (hygromycin-resistance FIG. 4C and gfp FIG. 4D). The number of PCR cycles (X axis) and the fluorescence intensity (Y axis) are shown. The specific kinetics are shown for huGFP products and ΔCpG-GFP products, as well as for the primer dimers.

FIG. 5: MIP1alpha expression analysis after transient transfection.

Representative ELISA analysis of the cell lysates and supernatants of transfected H1299 cells. H1299 cells were transfected with in each case 15 μg of wild-type and optimised murine MIP1alpha constructs. The respective protein concentration was quantified by conventional ELISA tests in the cell supernatant and in the cell lysate with the aid of corresponding standard curves. The shaded bars represent the mean value of the total protein concentration for in each case two independent batches, while the empty bars correspond to the standard deviation. The number of CpG dinucleotides in the open reading frame is plotted on the X axis and the total protein concentration in μg/ml is plotted on the Y axis. Wt corresponds to the expression construct of the respective wild-type gene.

FIG. 6:

MIP1alpha and GM-CSF expression analysis after transient transfection. Representative ELISA analysis of the supernatants of transfected H1299 cells. H1299 cells were transfected with in each case 15 μg of wild-type and optimised human MIP1alpha (FIG. 6A) and GM-CSF (FIG. 6B) constructs. The respective protein concentration in the supernatant of the cell culture 48 hours after transfection was quantified by conventional ELISA tests with the aid of corresponding standard curves. The shaded bars represent the mean value for in each case two independent batches, while the empty bars correspond to the standard deviation. The number of CpG dinucleotides in the open reading frame is plotted on the X axis and the protein concentration in the supernatant in μg/ml is plotted on the Y axis. Wt corresponds to the expression construct of the respective wild-type gene.

FIG. 7: Diagrammatic illustration of the used expression plasmids.

FIG. 7A: Plasmid map of the P-smallsyn plasmid.

FIG. 7B: Plasmid map of the Pc-ref plasmid. Modules and origin of the sequences (wild-type “Wt” in black, and synthetic in grey) are shown.

FIG. 8:

HIV-1 p24 detection after transient transfection.

Expression analysis of the P-smallsyn and Pc-ref vectors. H1299 cells were transfected with the specified constructs and the protein production was detected by conventional immunoblot analyses. Analysis of the cell lysates of HIV-1 p24 transfected H1299 cells. Molecular weights (precision plus protein standard, Bio-Rad) as well as the plot of the R/p24, s/p24 and mock-transfected samples are shown. Mock transfection corresponds to a transfection with the original pcDNA3.1 plasmid.

FIG. 9:

HIV-1 p24 expression analysis of various expression constructs. H1299 cells were transfected with in each case 15 μg R/p24, R/p24ΔCpG, s/p24 and s/p24ΔCpG constructs, as well as with pcDNA3.1 (mock control) in independent double batches. The respective p24 protein concentration in the cell lysate was quantified by conventional immunoblot analyses (FIG. 9A) and by ELISA tests (FIG. 9B) with the aid of corresponding standard curves. The shaded bars represent the mean value of the p24 concentration (in μg/ml) in the cell lysate for in each case 2 independent batches.

EXAMPLES

Example 1

Production of GFP Reporter Genes with Different CpG Content

Two variants of the green fluorescent protein (GFP) gene, which differ in the number of CpG dinucleotides, were produced. The huGFP gene had 60 CpGs, the ΔCpG-GFP gene had no CpGs. The CpG-depleted gene ΔCpG-GFP was constructed artificially. In the design of the ΔCpG-GFP care was taken to ensure that no rare codons or negatively acting cis-active elements such as splicing sites or poly(A) signal sites were introduced. The codon adaptation index (CAI), which is a measure of the quality of the codon choice, was altered only slightly by the deletion of the CpGs (CAI(huGFP)=0.95; CAI(ΔCpG-GFP)=0.94). The coding amino acid sequence of the GFP was in this connection not altered. Further interfaces were inserted for the sub-cloning. The nucleotide and amino acid sequences are given in SEQ ID NO. 1/2.

The sequence was produced as a fully synthetic gene (Geneart GmbH), cloned into the expression vector pcDNA/5FRT (Invitrogen) using the interfaces HindIII and Bam HI, and placed under the transcription control of the cytomegalovirus (CMV) early promoter/enhancer (“pc-ΔCpG-GFP”).

For the production of a similar expression plasmid, though unchanged in its CpG distribution, the coding region of the humanised GFP gene (huGFP) was amplified by means of a polymerase chain reaction (PCR) using the oligonucleotides huGFP-1 and huGFP-2 from a commercially obtainable vector, and likewise cloned into the expression vector pcDNA/5FRT (“pc-huGFP”, SEQ ID NO. 3/4) using the interfaces HindIII and Bam HI.
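The two design constraints stated above (a changed CpG count, an unchanged amino acid sequence) can be checked mechanically. The following is a minimal sketch assuming the standard genetic code; the two short reading frames are hypothetical toy sequences and are not taken from SEQ ID NO. 1 to 4, and the helper names are illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate an in-frame DNA sequence; '*' marks a stop codon."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("reading frame length must be a multiple of 3")
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def count_cpg(dna):
    """Number of CpG dinucleotides, including those spanning codon borders."""
    return dna.upper().count("CG")


def check_variant(original, variant):
    """Summarise CpG counts and confirm the encoded protein is unchanged."""
    return {
        "same_protein": translate(original) == translate(variant),
        "cpg_original": count_cpg(original),
        "cpg_variant": count_cpg(variant),
    }


# Hypothetical toy reading frames encoding M-A-R-T-stop.
original = "ATGGCGCGCACGTAA"
variant = "ATGGCCAGAACCTAA"  # synonymous codons chosen to avoid CpG
print(check_variant(original, variant))
```

A check of this kind covers only CpG content and protein identity; rare codons and cis-active elements such as splice or poly(A) signals would need separate screening.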

Production of Stable Cell Lines with the GFP Gene Variants

The Flp-In system of Invitrogen was used for a rapid establishment and selection of stable, recombinant cells.

A further, major advantage of this system is a directed integration of a copy of the transgene into a defined locus of the target cell. This technology thus provides the best conditions for the quantitative comparison of the expression of an arbitrary transgene, since physiological and genetic factors of the target cell are largely identical. In order to achieve an additional certainty, two different mammalian cells were selected for these comparative analyses. The cell lines Flp-In CHO and Flp-In 293T were obtained from Invitrogen and cultured at 37° C. and 5% CO₂. The cell lines were cultured in Dulbecco's modified Eagle medium high glucose (DMEM) (293T) and Ham's F12 (CHO) with L-glutamine, 10% inactivated fetal bovine serum, penicillin (100 U/ml) and streptomycin (100 μg/ml). The cells were sub-cultured in a ratio of 1:10 after confluence was achieved.

The establishment of stably transfected cells was carried out according to the manufacturer's instructions. 2.5×10⁵ cells were seeded out in 6-well culture dishes and transfected 24 hours later by calcium phosphate co-precipitation (Graham and van der Eb, 1973) with 1.5 μg transfer plasmid and 13.5 μg pOG44. Cells were selected up to a ratio of >90% GFP positive cells with 100 μg/ml hygromycin for 293T and 500 μg/ml for CHO cells. The number of GFP positive cells was determined for all cell lines by means of conventional flow cytometry analysis.

Determination of the GFP expression

The expression of the reporter constructs was determined over a period of 16 months by regular measurement of the GFP-mediated green autofluorescence in a flow cytometer (Becton-Dickinson). The data of the mean fluorescence intensities are summarised in FIG. 2A (293T cells) and 2B (CHO cells). The huGFP expression was found to be relatively constant in both cell lines over the whole measurement period, with a mean fluorescence intensity of 800 (293T) and 700 (CHO). The ΔCpG-GFP reporter construct, with a reduced number of CpGs, likewise exhibited a constant fluorescence intensity over the whole measurement period. The mean fluorescence intensity was however reduced by a factor of 10-20 (293T) and 6-9 (CHO) compared to the huGFP. The reduction of the GFP-mediated fluorescence could also be detected by fluorescence microscopy (FIG. 2C).

Since various causes may be involved in a decrease of the GFP-mediated fluorescence (instability of the protein, reduced nuclear export of RNA, lower transcription rate, etc.) additional western blot analyses and quantitative real-time PCRs were carried out.

For the protein detection by immunoblot, the stably transfected CHO cells were washed twice with ice-cold PBS (10 mM Na₂HPO₄, 1.8 mM KH₂PO₄, 137 mM NaCl, 2.7 mM KCl), scraped off in ice-cold PBS, centrifuged for 10 minutes at 300 g, and lysed for 30 minutes in lysis buffer on ice (50 mM Tris-HCl, pH 8.0, 0.5% Triton X-100 (w/v)). Insoluble constituents of the cell lysate were removed by centrifugation for 30 minutes at 10000 g and 4° C. The total amount of protein in the supernatant was determined by the Bio-Rad protein assay (Bio-Rad, Munich) according to the manufacturer's instructions. An equal volume of two-fold sample buffer (Laemmli, 1970) was added to the samples, which were heated for 5 minutes at 95° C. 40 μg of total protein from cell lysates were separated through a 12.5% SDS/polyacrylamide gel (Laemmli, 1970), electrotransferred to a nitrocellulose membrane and detected with a monoclonal GFP-specific antibody (BD-Bioscience) and a secondary, HRP (horseradish peroxidase) coupled antibody, and identified by means of chromogenic staining. Protein detection by western blot confirmed the data from the FACS measurement. For both gene variants the full-length GFP protein was detected in stably transfected CHO cells; no differences could be detected in the processing or proteolytic degradation (FIG. 3).

In order to clarify the transcription activity, a quantitative real-time PCR (Light Cycler, Roche) was carried out for the stably transfected CHO cells. Cytoplasmic RNA was prepared from the cells (RNeasy, Qiagen) and treated with DNase (500 U RNase-free DNase/20 μg RNA). 1 μg of the DNase-treated RNA was used as a matrix for a reverse transcription (Random Primed, p(dN)₆, 1st strand cDNA synthesis kit for RT-PCR, Roche) followed by PCR (RT-oligo1 and RT-oligo2). The resulting PCR product was diluted and used for a light cycler (LC) analysis (SYBR, Roche). As internal control, the RNA amount of the hygromycin-resistance gene similarly integrated into the cell genome was measured. The results are summarised in FIG. 4. The RNA amounts of the hygromycin-resistance gene showed no difference between any of the measured constructs (FIG. 4A for CHO cells and 4C for 293T cells). The results of the GFP RNA however correlated very well with the results of the protein expression (GFP fluorescence intensity). For the CpG-deleted construct, after quantification of the light cycler data an approximately seven times smaller cytoplasmic RNA amount was detected in CHO cells (FIG. 4B) and an approximately thirty times smaller RNA amount was detected in 293T cells (FIG. 4D), compared to the initial construct.
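The text does not state how the light cycler data were converted into relative RNA amounts. One common approach, shown here purely as an illustrative assumption, is the comparative threshold-cycle (ΔΔCt) calculation with the hygromycin-resistance transcript as internal reference and an assumed amplification efficiency of 2 per cycle. The Ct values in the example are hypothetical and are not read from FIG. 4.

```python
def fold_change(ct_target_test, ct_ref_test, ct_target_ctrl, ct_ref_ctrl,
                efficiency=2.0):
    """Relative amount of target RNA in the test sample versus the control.

    Each sample is first normalised to its reference transcript (delta Ct),
    then the test sample is compared with the control (delta delta Ct).
    efficiency=2.0 assumes perfect doubling in every PCR cycle.
    """
    delta_test = ct_target_test - ct_ref_test
    delta_ctrl = ct_target_ctrl - ct_ref_ctrl
    return efficiency ** -(delta_test - delta_ctrl)


# Hypothetical Ct values: gfp crosses the threshold 3 cycles later in the
# CpG-depleted line, while the reference transcript is unchanged.
ratio = fold_change(ct_target_test=25.0, ct_ref_test=20.0,
                    ct_target_ctrl=22.0, ct_ref_ctrl=20.0)
print(f"relative gfp RNA: {ratio:.3f} (about {1 / ratio:.0f}-fold lower)")
```

Under these assumptions a shift of about 2.8 cycles would correspond to a sevenfold difference and a shift of about 4.9 cycles to a thirtyfold difference.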

Example 2

Production of Murine MIP1alpha Genes with Different CpG Contents

In this example the nucleic acid sequence of the murine MIP1alpha gene was altered so as to form a series of constructs with different numbers of CpG dinucleotides, but without altering the coding amino acid sequence. For this purpose the amino acid sequence of the murine MIP1alpha gene product was translated back into synthetic MIP1alpha-coding reading frames, using the codon choice of human cells. In a first series of constructs the accidentally formed CpG dinucleotides were removed stepwise from the sequence, without however introducing rare codons that would be expected to adversely affect the expression. In addition a CpG dinucleotide-optimised MIP1alpha gene construct was produced, which contained twice as many CpG dinucleotides as the codon-optimised construct. In this case a deterioration of the codon choice was intentionally accepted, in order to introduce as many CpG dinucleotides as possible.

According to the prior art it would be expected that this gene construct would have a lower expression than the codon-optimised gene construct on account of its poorer codon choice.

These gene variants were constructed as fully synthetic reading frames, using long oligonucleotides and a stepwise PCR, and cloned into an expression vector. The produced MIP1alpha vector variants differed considerably as regards the level of expression of murine MIP1alpha. For the person skilled in the art it could not be foreseen that the variants with the fewest CpGs would be expressed worst, and that an increase in the CpGs would be accompanied by an increase of the MIP1alpha expression in mammalian cells. In particular it could not be foreseen by the person skilled in the art that the construct with the maximum possible number of CpG dinucleotides, which however were introduced at the expense of a deterioration of the codon choice, would exhibit a significantly stronger expression than the codon-optimised gene.

Variants of the murine MIP1alpha gene that differ in the number of CpG dinucleotides were synthetically constructed as described in Example 1 and sub-cloned into the expression vector pcDNA3.1 using the interfaces HindIII and NotI. The artificially produced genes were in each case matched as regards their codon choice to the mammalian system. When removing the CpG dinucleotides no rare mammalian codons were used, whereas when inserting CpG dinucleotides above the number of dinucleotides that are achieved with a normal codon adaptation, rare codons were intentionally also employed.

The constructs that are codon-optimised but provided with different numbers of CpG dinucleotides have throughout a CAI value of more than 0.9 and differ only slightly. The CAI values of the wild-type gene and of the CpG dinucleotide-optimised gene (42 CpGs) are, on the other hand, very low (below 0.8). According to the prior art a comparable expression of the codon-optimised genes would therefore be expected, but a significantly lower expression of the wild-type gene and of the CpG dinucleotide-optimised gene. The identification of the constructs, the number of CpGs as well as the CAI values are given in Table 1. The nucleotide and amino acid sequences are given in SEQ ID NO. 5/6 to SEQ ID NO. 13/14. The analogous expression construct (wild-type reference construct) corresponding to the wild-type sequence was unchanged as regards its CpG distribution.
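For orientation, the codon adaptation index is conventionally calculated as the geometric mean of the relative adaptiveness values of the codons in a reading frame, where the relative adaptiveness of a codon is its usage frequency divided by that of the most frequent synonymous codon. The sketch below follows that definition. The usage frequencies in `TOY_USAGE` are hypothetical placeholders covering only two amino acids, not a real human codon usage table, and the document does not specify which reference table or software produced the CAI values reported here.

```python
import math

# Hypothetical codon usage frequencies per amino acid (illustration only).
TOY_USAGE = {
    "A": {"GCC": 0.4, "GCG": 0.1},
    "R": {"AGA": 0.2, "CGC": 0.2},
}


def relative_adaptiveness(usage):
    """Map each codon to its frequency relative to the best synonymous codon."""
    weights = {}
    for codons in usage.values():
        top = max(codons.values())
        for codon, freq in codons.items():
            weights[codon] = freq / top
    return weights


def cai(dna, weights):
    """Geometric mean of codon weights; codons absent from the table are skipped."""
    dna = dna.upper()
    codons = [dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3)]
    logs = [math.log(weights[c]) for c in codons if c in weights]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))


weights = relative_adaptiveness(TOY_USAGE)
print(cai("GCCAGA", weights))  # Ala-Arg with preferred codons, no CpG
print(cai("GCGCGC", weights))  # Ala-Arg with CpG-rich codons
```

With these toy numbers the CpG-rich encoding carries three CpGs but a lower index, which mirrors the trade-off described for the construct with 42 CpGs.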

The coding region was amplified by means of a polymerase chain reaction (PCR) using the oligonucleotides mamip-1 and mamip-2 from a cDNA clone (obtained from RZPD) and likewise cloned into the expression vector pcDNA3.1 using the interfaces HindIII and NotI (“pc-mamip-wt”, SEQ ID NO. 15, GenBank Accession Number AA071899).

Checking the MIP1alpha Expression

In order to quantify the chemokine expression, human H1299 cells were transfected with the respective expression constructs and the amount of protein in the cells and in the cell culture supernatant was measured by means of commercial ELISA test kits.

1.5×10⁵ human lung carcinoma cells (H1299) were seeded out in 6-well cell culture dishes and transfected 24 hours later by calcium phosphate precipitation with 15 μg of the corresponding expression plasmid. The cells and cell culture supernatant were harvested 48 hours after the transfection. The transfected cells were lysed as described in Example 1 and the total amount of protein of the cell lysate was determined with the Bio-Rad protein assay. Insoluble cell constituents were removed from the cell culture supernatant by centrifugation at 10000 g for 15 minutes at 4° C.

From 1-5 μg total protein from cell lysates as well as from diluted cell culture supernatants, the expression of MIP1alpha was checked in each case in a commercially obtainable ELISA assay (R & D Systems) according to the manufacturer's instructions. The total amount of detectable MIP1alpha correlated with the number of CpGs in the reading frame, in a comparable manner to the data of the GFP expression constructs and p24 expression constructs. The data are summarised in Table 1. The number of constructs permitted for the first time a detailed evaluation of the connection of the level of expression with the number of CpGs within the coding region.

A representative result of an evaluation by means of cytokine ELISA is shown in FIG. 5. The shaded bars correspond to the mean value of two independent transfection batches, while the empty bars represent the respective standard deviations.

The relative protein amounts of two independent transient transfection experiments (in double batches) referred to the wild-type construct are listed in Table 1. These results demonstrate a marked reduction of the protein expression with the decrease in CpG dinucleotides and a marked increase compared to the wild-type gene and to the codon-optimised genes, correlating with the additional introduction of such motifs and despite a deterioration of the codon matching.

TABLE 1 Expression comparison of murine MIP1alpha genes

Construct      SEQ ID NO.   Expression*   St. Dev.**   CpG No.   CAI***
pc-maMIP wt    15           100%          4%           8         0.76
pc-maMIP 0     5            2%            9%           0         0.92
pc-maMIP 2     7            8%            27%          2         0.93
pc-maMIP 4     9            7%            33%          4         0.93
pc-maMIP 13    11           146%          5%           13        0.97
pc-maMIP 42    13           246%          4%           42        0.72

*Percentage mean value of the amount of protein from 2 tests (in double batches) in relation to the total amount of protein of the wild-type construct (maMIP wt)
**Standard deviation
***Codon adaptation index
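The relationship between CpG number and expression in Table 1 can be summarised with a rank correlation. The sketch below uses the CpG numbers and relative expression values from Table 1 and a plain Spearman formula that is valid because neither column contains ties. The choice of Spearman's coefficient is an illustrative assumption; the document itself reports no correlation statistic, and with six constructs the figure is descriptive only.

```python
def ranks(values):
    """Rank positions (1 = smallest); assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(x, y):
    """Spearman rank correlation for paired samples without ties."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Values from Table 1, in row order (wt, 0, 2, 4, 13, 42 CpGs).
cpg_number = [8, 0, 2, 4, 13, 42]
expression_percent = [100, 2, 8, 7, 146, 246]
cai_value = [0.76, 0.92, 0.93, 0.93, 0.97, 0.72]  # listed for reference only

rho = spearman(cpg_number, expression_percent)
print(f"Spearman rho (CpG number vs expression): {rho:.3f}")
```

The only rank inversion is between the constructs with 2 and 4 CpGs, whose expression values (8% and 7%) lie within each other's standard deviation.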

Example 3

Production of Human and Murine Cytokine Genes with Different CpG Contents

In order to be able to further confirm the hitherto obtained results and interpretations, variants of the human MIP1alpha gene, of the human GM-CSF gene, of the human IL-15 gene and of the murine GM-CSF gene, which differ in the number of CpG dinucleotides from the wild-type gene, were artificially constructed similarly to Example 2 and sub-cloned into the expression vector pcDNA3.1 using the interfaces HindIII and NotI. The identification of the constructs, number of CpGs as well as the CAI values are given in Table 2. The nucleotide and amino acid sequences of the wild-type sequences (wt) and of the sequences with an altered number of CpG dinucleotides are given in SEQ ID NO. 17/18 to SEQ ID NO. 23/24 and SEQ ID NO. 48/49 to SEQ ID NO. 54/55. The expression constructs were amplified by means of a polymerase chain reaction (PCR) using the oligonucleotides humip-1 and humip-2, hugm-1 and hugm-2, huil-1 and huil-2, magm-1 and magm-2 from corresponding cDNA clones (obtained from RZPD) and were cloned into the expression vector pcDNA3.1, likewise using the interfaces HindIII and NotI (“pc-huMiP-wt”, GenBank Accession Number NM_021006, “pc-huGM-wt”, GenBank Accession Number M11220, “pc-huIL-wt”, GenBank Accession Number BC018149, “pc-muGM-wt”, GenBank Accession Number NM_049969 with a deviation).

Checking the Cytokine Expression

In order to quantify the cytokine expression, human cells were transfected with the respective expression constructs and the amount of protein in the cell culture supernatant was measured by means of commercial ELISA test kits.

As described in Example 2, H1299 cells were transfected transiently with 15 μg of the corresponding expression plasmid. The cell culture supernatant was harvested 48 hours after the transfection. Insoluble cell constituents were removed from the cell culture supernatant by centrifugation.

Using diluted cell culture supernatants, the expression of human MIP1alpha, human GM-CSF and IL-15 and murine GM-CSF was checked in each case in a commercially obtainable ELISA assay (R & D Systems for MIP1alpha; BD Pharmingen for GM-CSF and IL-15). In a comparable way to the data of the aforementioned expression constructs, the total amount of detectable cytokines in the culture supernatant correlated with the number of CpGs in the reading frame. The data are summarised in Table 2. A representative result of an evaluation by means of cytokine ELISA is shown in FIG. 6. The shaded bars correspond to the mean value of two independent transfection batches, while the empty bars represent the respective standard deviation.

The relative amounts of protein in each case from a transient transfection experiment (in double batches) referred to the wild-type construct are listed in Table 2. Similarly to the results from Example 2, these results too confirm a marked increase in protein production, correlating with the additional introduction of such motifs, compared with the wild-type genes.

TABLE 2 Expression comparison of human cytokine/chemokine genes

Construct      SEQ ID NO.   Expression*   CpG No.   CAI**
pc-huMiP wt    21           100%          8         0.76
pc-huMiP 43    17           393%          43        0.72
pc-huGM wt     23           100%          10        0.82
pc-huGM 63     19           327%          63        0.70
pc-huIL wt     56           100%          3         0.65
pc-huIL 21     52           313%          21        0.98
pc-muGM wt     58           100%          11        0.75
pc-muGM 62     54           410%          62        0.75

*Percentage mean value of the amount of protein from in each case one experiment, in double batches, in relation to the total amount of protein of the corresponding wild-type construct (denoted wt).
**Codon adaptation index

Example 4

Production of a Plasmid with a Reduced Number of CpG Dinucleotides to Increase the Expression

The nucleic acid sequence of the plasmid pcDNA5 (Invitrogen) was used as a basis for the production of a modularly constructed plasmid in which the number of CpG dinucleotides had been reduced as far as possible. The DNA sequence which codes for the ampicillin resistance gene (bla) was synthetically produced as described in Example 1, and sub-cloned using the restriction interfaces ClaI and BglII. The number of CpGs was in this connection reduced from 72 to 2. Likewise, the multiple cloning site was redesigned, synthetically constructed, and sub-cloned using the restriction interfaces SacI and PmeI, whereby the number of CpGs was reduced from 11 to 1. The CMV promoter (31 CpGs), the BGH polyadenylation site (3 CpGs) and the pUC replication origin (45 CpGs) were integrated unchanged into the plasmid. The hygromycin-resistance cassette was deleted. The CMV promoter was cloned by PCR amplification with the oligonucleotides CMV-1 and CMV-2, which in addition added a ClaI and a SacI restriction interface 3′ and 5′. In a similar way the pUC ori was amplified with the oligonucleotides ori-1 (contains XmaI interface) and ori-2 (contains BglII interface), and the BGH polyadenylation site was amplified with the oligonucleotides pa-1 (PmeI) and pa-2 (XmaI) by PCR, and sub-cloned using the corresponding restriction enzymes. The plasmid pcDNA5 was used as a template in all PCR reactions. The structure of this plasmid is shown diagrammatically in FIG. 7A (“P-smallsyn”), and the complete sequence is given in SEQ ID NO. 25.
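The per-module CpG numbers stated above can be totalled to give a rough CpG budget for the modules that make up P-smallsyn. The sketch below uses only the counts given in the text; the CpG content of the deleted hygromycin-resistance cassette and of any linker sequences is not stated and is therefore left out, so the totals cover the listed modules only and are not CpG counts for the complete plasmids.

```python
# (CpGs in pcDNA5-derived module, CpGs in P-smallsyn module), from the text.
modules = {
    "bla (ampicillin resistance)": (72, 2),
    "multiple cloning site": (11, 1),
    "CMV promoter": (31, 31),            # integrated unchanged
    "BGH polyadenylation site": (3, 3),  # integrated unchanged
    "pUC replication origin": (45, 45),  # integrated unchanged
}

cpg_before = sum(before for before, _ in modules.values())
cpg_after = sum(after for _, after in modules.values())
reduction_percent = 100 * (cpg_before - cpg_after) / cpg_before

for name, (before, after) in modules.items():
    print(f"{name:30s} {before:3d} -> {after:3d}")
print(f"listed modules total: {cpg_before} -> {cpg_after} "
      f"({reduction_percent:.1f}% fewer CpGs)")
```

On these figures the redesigned modules remove 80 of 162 CpGs, and the remaining 79 CpGs sit almost entirely in the three modules that were carried over unchanged.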

In order to investigate the influence of the number of CpGs in the vector on the level of expression of a transcript, the reference vector was modified so that it could be used as control. By PCR amplification using the oligonucleotides ref-del-1 and ref-del-2, which in each case introduced a NsiI restriction interface at the 5′ end, cleavage with NsiI and ligation, the hygromycin-resistance cassette was removed from the plasmid pcDNA5 (see FIG. 7B, “Pc-ref”).

The p24 capsid protein derived from HIV-1 was used as test transcript. The coding region of p24 already previously optimised for expression in human cells (Graf et al., 2000) was amplified by means of PCR using the oligonucleotides p24-1 and p24-2 from an HIV-1 syngag construct (Graf et al., 2000) and cloned into the two comparison vectors using the interfaces HindIII and Bam HI (“R/p24” and “s/p24”).

Checking the HIV-1 p24 expression in different vector backgrounds

In order to check the influence of the CpG number in the vector on the expression of the transcript, the constructs R/p24 and s/p24 were transiently transfected into human cells and the expression of p24 was analysed.

As described in Example 2, H1299 cells were transiently transfected with 15 μg of the corresponding expression plasmid. Cells were harvested 48 hours after transfection. The transfected cells were lysed as described in Example 1 and the total amount of protein in the supernatant was determined with the Bio-Rad protein assay. 50 μg of total protein from cell lysates were tested as described in Example 1 in a western blot analysis with the monoclonal p24-specific antibody 13-5 (Wolf et al., 1990) (FIG. 8). In two independent transfection batches a markedly higher p24 expression was detected after transfection of the smallsyn construct (s/p24).

Production of HIV p24 genes with different CpG contents

Two variants of the capsid protein gene p24 derived from HIV-1, which differ in the number of CpG dinucleotides, were produced. The syn p24 gene had 38 CpGs, whereas the p24ΔCpG gene had no CpGs. The CpG-depleted gene p24ΔCpG was artificially constructed as described in Example 1 and cloned into the expression vector P-smallsyn (described in Example 4) (“s/p24ΔCpG”) and into the reference vector Pc-ref (“R/p24ΔCpG”) using the HindIII and BamHI restriction sites. The nucleotide and amino acid sequences of p24ΔCpG are given in SEQ ID NO. 26/27. The plasmids R/p24 and s/p24, which are described in Example 4, were used as reference constructs.
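Removing every CpG without changing the encoded protein is possible because of the degeneracy of the genetic code. The sketch below illustrates the principle only; it is not the design procedure used for p24ΔCpG. It assumes the standard genetic code, an in-frame upper- or lower-case DNA input, and a simple rule that keeps the original codon whenever possible and otherwise takes the first admissible synonymous codon, without regard to codon usage. The names `deplete_cpg`, `translate` and `count_cpg` are hypothetical. The input is the first 39 nucleotides of huGFP (SEQ ID NO. 3).

```python
BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
TABLE = dict(zip(CODONS, AMINO))
SYN = {}  # amino acid -> list of synonymous codons
for codon, aa in TABLE.items():
    SYN.setdefault(aa, []).append(codon)


def translate(seq: str) -> str:
    seq = seq.upper()
    return "".join(TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def count_cpg(seq: str) -> int:
    return seq.upper().count("CG")


def deplete_cpg(seq: str) -> str:
    """Return a synonymous sequence without CpG, changing as few codons as possible."""
    seq = seq.upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    out = []
    for i, codon in enumerate(codons):
        nxt = codons[i + 1] if i + 1 < len(codons) else None
        # If every codon of the next amino acid starts with G (Val, Ala, Asp,
        # Glu, Gly), the current codon must not end in C.
        next_needs_g = nxt is not None and all(c[0] == "G" for c in SYN[TABLE[nxt]])
        prev_ends_c = bool(out) and out[-1][-1] == "C"

        def ok(c: str) -> bool:
            if "CG" in c:                      # CpG inside the codon
                return False
            if prev_ends_c and c[0] == "G":    # CpG across the previous junction
                return False
            if next_needs_g and c[-1] == "C":  # unavoidable CpG at the next junction
                return False
            return True

        out.append(codon if ok(codon) else next(c for c in SYN[TABLE[codon]] if ok(c)))
    return "".join(out)


fragment = "ATGGTGAGCAAGGGCGAGGAGCTGTTCACCGGGGTGGTG"
depleted = deplete_cpg(fragment)
print(count_cpg(fragment), "->", count_cpg(depleted))
print(translate(fragment) == translate(depleted))
```

A practical design would additionally weight the synonymous codons by the codon usage of the expression host, which is why the codons chosen here need not match those in the ΔCpG sequences listed in the sequence description.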

Checking the HIV-1 p24 Expression

In order to check the influence of the CpG number in the vector and in the insert (transcript), the constructs R/p24, R/p24ΔCpG, s/p24 and s/p24ΔCpG were transiently transfected into human cells and the expression of p24 was analysed.

As described in Example 2, H1299 cells were transiently transfected with 15 μg of the corresponding expression plasmid. Cells were harvested 48 hours after transfection. The transfected cells were lysed as described in Example 1 and the total amount of protein in the lysate was determined with the Bio-Rad protein assay. 50 μg of total protein from cell lysates were checked for the expression of p24 as described in Example 1 in a western blot analysis with the monoclonal p24-specific antibody 13-5 (FIG. 9A). As already shown in Example 4, the use of the CpG-depleted vector P-smallsyn with the identical transgene led to a visible increase in p24 production (compare R/p24 and s/p24). Consistent with the data for the GFP and cytokine/chemokine expression constructs, the amount of detectable p24 in the cell lysate, using the identical vector background, correlated with the number of CpGs in the reading frame (compare R/p24 and R/p24ΔCpG as well as s/p24 and s/p24ΔCpG). The data were confirmed in a p24-specific ELISA test (FIG. 9B). The construct with 38 CpGs (p24) yielded a ca. 2.5 times (Pc-ref) or ca. 25% (P-smallsyn) larger amount of p24 than the construct without CpGs (p24ΔCpG). The results are illustrated in FIG. 9.

The correlation of protein production with the number of CpG dinucleotides could be demonstrated in the Examples given here. The selected genes are derived from organisms as different as a jellyfish, a human pathogenic virus, and mammals. It is therefore reasonable to regard this mechanism as generally valid. The examples furthermore demonstrate that this correlation holds in vitro both for transient transfection and for stable recombinant cells. The method described here, namely to alter gene expression in eukaryotes in a targeted manner by modulating the number of CpG dinucleotides, both in the coding region and in the vector background, may consequently be used for the production of biomolecules for biotechnological, diagnostic or medical applications.

Description of the sequences

1. Oligonucleotides

SEQ ID NO.  Identification  Sequence 5′-3′
28          huGFP-1         CAATAAGCTTGCCACCATGGTGAGCAAGGGCGAG
29          huGFP-2         AGTAGGATCCTATTACTTGTACAGCTCGT
30          RT-oligo1       CCCTGAAGTTCATCTGCACC
31          RT-oligo2       GATCTTGAAGTTCACCTTGATG
32          mamip-1         CAGGTACCAAGCTTATGAAGGTCTCCACCACTGC
33          mamip-2         CAGAGCTGGAGTCATGAAGACTAGGCATTCAGTTCCAGGTCAG
34          hugm-1          CAGGTACCAAGCTTATGTGGCTGCAGAGCCTGC
35          hugm-2          CAGAGCTCGAGTCATGAAGACTACTCCTGGACTGGCTCCCAGC
36          humip-1         CAGTACCAAGCTTATGCAGGTCTCCACTGCTGC
37          humip-2         CAGAGCTCGAGTCATGAAGACTAGGCACTCAGCTCCAGGTCACTG
38          p24-1           ACTAGGTACCATCTAAGCTTATGCCCATCGTGCAGAACATCCA
39          p24-2           TCAAGAGCTCGACTGGATCCTATTACAGCACCCTGGCCTTGTGGC
40          CMV-1           CAAAGGTACCGTTAATCGATGTTGACATTGATTATTGACTA
41          CMV-2           GAATGAGCTCTGCTTATATAGACC
42          ori-1           GTCACCCGGGTAGTGAATTCATGTGAGCAAAAGGC
43          ori-2           GATCTTTTCTACGGGAGATCTGTCAATCGATAGCT
44          pa-1            GTTAGAGCTCCAGTGTTTAAACCTGTGCCTTCTAGTTGCCAG
45          pa-2            CAAACCTACCGATACCCGGGCCATAGAGCCCACCGCATC
46          ref-del-1       TCAGATGCATCCGTACGTTAACATGTGAGCAAAAGGCCAGCA
47          ref-del-2       AGTCATGCATCCATAGAGCCCACCGCATCCCCA
48          hull-1          CAGGTACCAAGCTTATGAGAATTTCGAAACCAC
49          hull-2          CAGAGCTCGAGTCATGAAGACTAAGAAGTGTTGATGAACATTTGG
50          magm-1          CAGGTACCAAGCTTATGGCCCACGAGAGAAAGGC
51          magm-2          CAGAGCTCGAGTCATGAAGACTATTTTTGGCCTGGTTTTTTGC

2. Polypeptide-coding sequences and vector sequences

SEQ ID NO.
1 + 2: ΔCpG-GFP (nucleic acid+ polypeptide) ATGGTGTCCAAGGGGGAGGAGCTGTTCAGAGGGGTGGTGCCCATCCTGGTGGAGCTGGATGGGGATGTGAATGGCCACAAGTTCTCTGTGTGTGGGGAGGGGGAGGGGGATGCCAGCTATGGCAAGCTCACCCTGAAGTTCATCTGCACCAGAGGCAAGCTGCCAGTGCCCTGGCCCACCCTGGTGACCACCTTCACCTATGGGGTGCAGTGCTTCAGCAGATACCCAGACCACATGAAGCAGCATGACTTCTTCAAGTCTGCCATGCCTGAGGGCTATGTGCAGGAGAGGACCATCTTCTTCAAGGATGATGGCAACTACAAGACCAGGGCTGAGGTGAAGTTTGAGGGGGATAGCGTGGTGAACAGGATTGAGCTGAAGGGCATTGACTTTAAGGAGGATGGCAATATCCTGGGCCACAAGCTGGAGTACAACTACAACAGGCACAATGTGTACATCATGGCAGACAAGCAGAAGAATGGCATCAAGGTGAACTTGAAGATCAGGCACAACATTGAGGATGGCTCTGTGCAGGTGGCAGACCACTACCAGCAGAACACCCCCATTGGAGATGGCCCTGTCCTGGTGCCAGACAACCAGTACCTGAGCAGCGAGTGTGCCCTGAGCAAGGACCCCAATGAGAAGAGGGACCACATGGTGCTGCTGGAGTTTGTGACAGCTGCTGGCATCACCCTGGGCATGGATGAGCTGTACAAG TGA SEQ ID NO. 3 + 4:huGFP (nucleic acid + polypeptide)ATGGTGAGCAAGGGCGAGGAGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGAGCTGGACGGCGACGTAAACGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAGGGCGATGCCACCTACGGGAAGCTGACCCTGAAGTTCATCTGCAGGAGCGGGAAGCTGCCCGTGGCGTGGCCCACCGTCGTGACCACCTTCACCTACGGCGTGCAGTGGTTCAGCCGCTACCCCGACCACATGAAGCAGCACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTGAAGGACGACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAACATCCTGGGGCACAAGCTGGAGTACAAGTACAACAGCCACAACGTCTATATCATGGCCGACAAGCAGAAGAACGGCATGAAGGTGAACTTCAAGATGCGCCACAACATCGAGGACGGCAGCGTGCAGCTCGCCGACCACTACCAGCAGAACACCCCCATCGGCGACGGCCGGGTGCTGCTGCCCGACAACCACTACCTGAGCACCCAGTCCGCCCTGAGCAAAGACCCCAACGAGAAGCGCGATCACATGGTCGTGCTGGAGTTCGTGACCGCCGCCGGGATCACTGTCGGCATGGACGAGCTGT ACAAGTAA SEQ ID NO. 5+ 6: murine MIP1alpha-0 CpG (nucleic acid + polypeptide)ATGAAGGTGAGCACAAGAGGTCTGGGTGTGGTGCTGTGTACCATGACCCTGTGCAACCAGGTGTTCTCTGCCCGTTATGGAGCAGATACCCGTACAGCCTGCTGTTTCAGCTAGAGCAGGAAGATCCCCAGGCAGTTCATTGTGGACTACTTTGAGAGGAGGAGCCTGTGTTCTCAGCCTGGGGTGATCTTTCTGACCAAGAGGAACAGGCAGATCTGTGCAGACAGGAAGGAGACATGGGTGCAGGAGTACATCACAGACCTGGAGCTGAATGCCTAG SEQ ID NO. 
7 + 8: murineMIP1alpha-2 CpG (nucleic acid + polypeptide)ATGAAGGTGAGCAGAACAGCTCTGGCCGTGCTGCTGTGTACCATGACCCTGTGCAACCAGGTGTTCTCTGCCCGTTATGGAGCAGATACGGCTACAGCCTGCTGTTTCAGGTACAGGAGGAAGATGGCGAGGGAGTTGATCGTGGACTACTTTGAGACCAGCAGCCTGTGTTCTCAGCCTGGGGTGATCTTTCTGACCAAGAGGAACAGGCAGATCTGTGCAGACAGCAAGGAGACATGGGTGCAGGAGTACATCACAGACCTGGAGCTGAATGCCTAG SEQ ID NO. 9 + 10: murineMIP1alpha-4CpG (nucleic acid + polypeptide)ATGAAGGTGAGCACAACAGCTCTGGCCGTGCTGCTGTGTACCATGACCCTGTGCAACCAGGTGTTCTCTGCCCCTTACGGAGCAGATACCCCTACAGCCTGCTGTTTCAGCTACAGCAGGAAGATCCCCAGGCAGTTCATCGTGGACTACTTTGAGACCAGCAGGCTGTGTTCTCAGCCTGGGGTGATCTTTCTGACCAAGAGGAACCGCCAGATCTGTGCAGACAGCAAGGAGACATGGGTGCAGGAGTACATCACAGACCTGGAGCTGAATGCCTAG SEQ ID NO. 11 + 12: murineMIP1alpha-13CpG (nucleic acid + polypeptide)ATGAAGGTGAGCACCACAGCTCTGGCTGTGCTGCTGTGCACCATGACCCTGTGCAACCAGGTGTTCAGCGCTCCTTACGGCGCCGATACCCCTACAGCCTGCTGCTTCAGCTACAGCAGGAAGATCCCCAGGCAGTTCATCGTGGACTACTTCGAGACCAGCAGCCTGTGTTCTCAGCCCGGCGTGATCTTCCTGACCAAGCGGAACAGACAGATCTGCGCCGACAGCAAGGAGACATGGGTGCAGGAGTACATCACCGACCTGGAGCTGAACGCCTAG SEQ ID NO. 13 + 14: murineMIP1alpha-42 CpG (nucleic acid + polypeptide)ATGAAGGTGTCGACGACCGCGCTCGCCGTGCTGCTGTGCACGATGACGCTGTGCAACCAGGTGTTCAGCGCCCCGTACGGCGCCGACACGCCGACCGCGTGCTGCTTCTCGTACTCGCGGAAGATCCCGCGGCAGTTCATCGTCGACTACTTCGAAACGTCGTCGCTGTGCTCGCAGCCCGGCGTGATCTTCCTCACGAAGCGGAACCGGCAGATCTGCGCCGACTCGAAGGAAACGTGGGTGCAGGAGTACATCACCGACCTCGAACTGAACGCGTAG SEQ ID NO. 15 + 16: murineMIP1alpha wild-type (7 CpG) (nucleic acid + polypeptide)ATGAAGGTCTCCACCACTGCCCTTGCTGTTCTTCTCTGTACCATGACACTCTGCAACCAAGTCTTCTCAGCGCCATATGGAGCTGACACCCCGACTGCCTGCTGCTTCTCCTACAGCCGGAAGATTCCACGCCAATTCATCGTTGACTATTTTGAAACCAGCAGCCTTTGCTCCCAGCCAGGTGTCATTTTCCTGACTAAGAGAAACCGGCAGATCTGCGCTGACTCCAAAGAGACCTGGGTCCAAGAATACATCACTGACCTGGAACTGAATGCCTAG SEQ ID NO. 
17 + 18: humanMIP1alpha-43CpG (nucleic acid + polypeptide)ATGCAAGTGTCGACCGCCGCTCTCGCCGTGCTGCTGTGCAGGATGGCGCTGTGCAACCAAGTGCTGAGCGCGCCTCTCGCCGCCGACACGCCGACCGCGTGCTGCTTCTCGTACACGTCGCGGCAGATCCCGCAGAACTTCATCGCCGACTACTTCGAGACGTCGTCGCAGTGCTCGAAGCCGAGCGTGATCTTCCTGACGAAGGGCGGACGGCAAGTGTGCGCCGACCCGAGCGAGGAGTGGGTGCAGAAGTACGTGAGCGACCTCGAACTGAGCGCGTAG SEQ ID NO 19 + 20: humanGM-CSF-63CpG (nucleic acid + polypeptide)ATGTGGCTGCAGTCGCTGCTGCTGCTCGGAACCGTCGCGTGTTCGATCAGCGCGCCTGCGCGGTCGCCGTCGCCGTCGACGCAGCCGTGGGAGCACGTGAACGCGATCCAGGAGGCGCGACGGCTGCTGAACCTGTCGCGCGATACAGCCGCCGAGATGAACGAGACCGTCGAGGTGATCAGCGAGATGTTCGACCTGCAGGAGCCGACGTGCCTGCAGACGCGGCTCGAACTGTATAAGCAGGGCCTCCGCGGCTCGCTCACGAAGCTGAAGGGCCCGCTCACGATGATGGCGTCGCACTACAAGCAGCACTGCCCGCCGACGCCCGAAACGTCGTGCGCGACGCAGATCATCACGTTCGAGTCGTTCAAGGAGAACCTGAAGGACTTCCTGCTCGTGATCCCGTTCGATTGCTGGGAGCCCGT GCAGGAGTAG SEQ ID NO.21 + 22: human MIP1alpha wild-type (8CpG) (nucleic acid + polypeptide)ATGCAGGTCTCCACTGCTGCCCTTGCCGTCCTCCTCTGCACCATGGCTCTCTGCAACCAGGTCCTCTCTGCACCACTTGCTGCTGACACGCCGACCGCCTGCTGCTTCAGCTACACCTCCCGACAGATTCCACAGAATTTCATAGCTGACTACTTTGAGACGAGCAGCCAGTGCTCCAAGCCCAGTGTCATCTTCCTAACCAAGAGAGGCCGGCAGGTCTGTGCTGACCCCAGTGAGGAGTGGGTCCAGAAATACGTCAGTGACCTGGAGCTGAGTGCCTAG SEQ ID NO. 23 + 24: humanGM-CSF wild-type (10CpG) (nucleic acid + polypeptide)ATGTGGCTGCAGAGCCTGCTGCTCTTGGGCACTGTGGCCTGCAGCATCTCTGCACCCGCCCGCTCGCCCAGCCCCAGCACGCAGCCCTGGGAGCATGTGAATGCCATCCAGGAGGCCCGGCGTCTCCTGAACCTGAGTAGAGACACTGCTGCTGAGATGAATGAAACAGTAGAAGTCATCTCAGAAATGTTTGACCTCCAGGAGCCGACCTGCCTACAGACCCGCCTGGAGCTGTACAAGCAGGGCCTGCGGGGCAGCCTCACCAAGCTCAAGGGCCCCTTGACCATGATGGCCAGCCACTACAAGCAGCACTGCCCTCCAACCCCGGAAACTTCCTGTGCAACCCAGATTATCACCTTTGAAAGTTTCAAAGAGAACCTGAAGGACTTTCTGCTTGTCATCCCCTTTGACTGCTGGGAGCCAGTCCAGGA GTAG SEQ ID NO. 
25:P-smallsyn (nucleic acid sequence of the plasmid)ATCGATGTTGACATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGCCCCCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACATCAATGGGCGTGGATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTTGTTTTGGCACCAAAATCAACGGGACTTTCCAAAATGTCGTAACAACTCCGCCCCATTGACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATAAGCAGAGCTCTCTGGCTAACTAGAGAACCCACTGCTTACTGGCTTATCTAAATTAATACGACTCACTATAGGGAGACCCAAGCTGTTAAGCTTGGTAGATATCAGGGATCCACTCAGCTGATCAGCCTCCAGTTTAAACCTGTGCCTTCTAGTTGCCAGCCATCTGTTGTTTGCCCCTCCCCCGTGCCTTCCTTGACCCTGGAAGGTGCCACTCCCACTGTCCTTTCCTAATAAAATGAGGAAATTGCATCGCATTGTCTGAGTAGGTGTCATTCTATTCTGGGGGGTGGGGTGGGGCAGGACAGCAAGGGGGAGGATTGGGAAGACAATAGCAGGCATGCTGGGGATGCGGTGGGCTCTATGGCCCGGGTAGTGAATTCATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCGTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGGACAGTATTTGGTATCTGCTCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGAGATCTGTCTGACTCTGAGTGGAACCAAAACTCATGTTAAGGGATTTTGGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAGGTTAACTTACCAATGCTTAATCAATGAGGCACCAATCTCTGCAATCTGCCTATTTCTCTCATCCATGGTTGCCTGACTGCCTGTGGTGTAGATAACTACAATCCTGGAGGGCTTACCATCTGGCCCCAGTGCTGCAATGATACCTCTAGACCCTCTCTCA
CCTGCTCCAGATTTATCTGCAATGAACCAGCCAGCTGGAAGGGCAGACCTGAGAAGTGGTCCTGCAACTTTATCTGCCTCCATCCAGTCTATTAATTGTTGTCTGGAAGGTAGAGTAAGCAGTTCACCAGTTAATAGTTTCCTCAAGGTTGTTGCCATTGCTACAGGCATGGTGGTGTCCCTCTCATCATTTGGTATGGCTTCATTCAGCTCTGGTTCCCATCTATCAAGCCTAGTTACATGATCACCCATGTTGTGCAAAAAAGCAGTCAACTCCTTTGGTCCTCCAATGGTTGTCAAAAGTAAGTTGGCAGCAGTGTTATCACTCATGGTTATGGCAGCACTGCATAATTCTCTTACTGTCATGCCATCTGTAAGATGCTTTTCTGTGACTGGACTGTACTCAACCAAGTCATTCTGAGAATAGTGTATTCTTCTACCCAGTTGCTCTTGCCCAGCATCAATTCTGGATAATACTGCACCACATAGCAGAACTTTAAAGGTGCTCATCATTGGAAATCTTTCTTCTGGTCTAAAACTCTCAAGGATCTTACCAGAGTTGAGATCCAGTTCAATGTAACCCACTCTTGGACCCAACTGATCTTCAGCATCTTTTACTTTCACCAGGGTTTCTGGGTGAGCAAAAACAGGAAGGCAAAAGGCAGCAAAAAAGGGAATAAGGGCAACTCTGAAATGTTGAATACTCATAGTACTACTCTTCCTTTTTCAATATTATTGAAGCATTTATCAGGGTTATTGTCTCATGAGCGGATACATATTTGAATGTATTTAGAAAAATAAACAAATAGGGGTATGCATTCAGCTCACATTTCCCTGAAAAGTGCCACCTGAAATTGACTGATAGGGAGTTCTCCCAATCCCCTATGGTGCACTCTCAGTACAATCTGCTCTGATGCCTCATAGTTAAGCCAGTATCTGCTCCCTGCTTGTGTGTTGGAGGTCACTGAGTAGTGGGCTAGCAAAATTTAAGCTACAACAAGGCAAGGCTTGACCTACAATTGCATGAAGAATCTGCTTAGGGTTAGGCCTTTTGCACTGCTTGGAGATGTACTG GCCAGATATACTA SEQ IDNO. 26 + 27: p24ΔCpG (nucleic acid + polypeptide)ATGGTGCACCAGGCCATCAGCCCCAGGACCCTGAATGCCTGGGTGAAGGTGGTGGAGGAGAAGGCCTTCAGCCCTGAGGTGATCCCCATGTTCTCTGCCCTGTCTGAGGGGGCCACCCCCCAGGACCTGAACACCATGCTGAACACAGTGGGGGGCCACCAGGCTGCCATGCAGATGCTGAAGGAAACCATCAATGAGGAGGCTGCTGAGTGGGACAGAGTGCACCCTGTGCATGCTGGCCCCATTGCCCCTGGCCAGATGAGGGAGCCCAGGGGCTCTGACATTGCTGGCACCACCTCCACCCTGCAGGAGCAGATTGGCTGGATGACCAACAACCCCCCCATCCCTGTGGGGGAGATCTACAAGAGATGGATCATCCTGGGCCTGAACAAGATTGTGAGGATGTACAGCCCCACCTCCATCCTGGACATCAGGCAGGGCCCCAAGGAGCCCTTCAGGGACTATGTGGACAGGTTCTACAAGACCGTGAGGGCTGAGCAGGCCAGCCAGGAGGTGAAGAACTGGATGACAGAGACCCTGCTGGTGCAGAATGCCAACCCTGACTGCAAGACCATCCTGAAGGCCCTGGGCCCAGCTGCCACCCTGGAGGAGATGATGACAGCCTGCCAGGGGGTGGGAGGCCCTGGCCACAAGGCCAGGGTGCT GTAA SEQ ID NO. 
52 + 53:human IL-15-21CpG ATGCGGATCAGCAAGCCCCACCTGAGGAGCATCAGCATCCAGTGCTACCTGTGCCTGCTGCTGAACAGCCACTTCCTGACAGAGGCCGGCATCCACGTGTTTATCCTGGGCTGCTTCTCTGCCGGCCTGCCTAAGAGAGAGGCCAACTGGGTGAACGTGATCAGCGACCTGAAGAAGATCGAGGACCTGATCCAGAGCATGCACATCGACGCCACCCTGTACACAGAGAGCGACGTGCACCCTAGCTGTAAGGTGACCGCCATGAAGTGCTTCCTGCTGGAGCTGCAGGTGATCAGCCTGGAGAGCGGCGATGCCAGCATCCACGACACCGTGGAGAACCTGATCATCCTGGCCAACAACAGCCTGAGCAGCAACGGCAATGTGACCGAGAGCGGCTGCAAGGAGTGTGAGGAGCTGGAGGAGAAGAACATCAAGGAGTTCCTGCAGAGCTTCGTGCACATCGTGCAGATGTTCATCAA CACCAGCTAG SEQ ID NO.54 + 55: murine GM-CSF-62CpGATGTGGCTGCAGAACCTGCTGTTCCTCGGCATCGTCGTGTACTCGCTGAGCGCGCCGACGCGCTCGCCGATCACCGTGACGCGGCCGTGGAAGCACGTCGAGGCGATCAAGGAGGCGCTGAACCTGCTCGACGACATGCCCGTGACGCTGAACGAGGAGGTCGAGGTCGTGTCGAACGAGTTCTCGTTCAAGAAGCTGACGTGCGTGCAGACGCGGCTGAAGATCTTCGAGCAGGGCCTGCGCGGCAACTTCACGAAGCTGAAGGGCGCGCTGAACATGACCGCGTCGTACTACCAGACGTACTGCCCGCCGACGCCCGAGACCGATTGCGAGACGCAGGTGACGACGTACGCCGACTTCATCGACTCGCTGAAGACGTTCCTGACCGACATCCCGTTCGAGTGCAAGAAGCCCGGCCAGAAGTAG SEQ ID NO. 56 + 57: humanIL-15 wild-type (3CpG)ATGAGAATTTCGAAACCACATTTGAGAAGTATTTCCATCCAGTGCTACTTGTGTTACTTCTAAACAGTCATTTTCTAACTGAAGCTGGCATTCATGTCTTCATTTTGGGCTGTTTCAGTGCAGGGCTTCCTAAAACAGAAGCCAACTGGGTGAATGTAATAAGTGATTTGAAAAAAATTGAAGATCTTATTCAATCTATGCATATTGATGCTACTTTATATACGGAAAGTGATGTTCACCCCAGTTGCAAAGTAACAGCAATGAAGTGCTTTCTCTTGGAGTTACAAGTTATTTCACTTGAGTCCGGAGATGCAAGTATTCATGATACAGTAGAAAATCTGATCATCCTAGCAAACAACAGTTTGTCTTCTAATGGGAATGTAACAGAATCTGGATGCAAAGAATGTGAGGAACTGGAGGAAAAAAATATTAAAGAATTTTTGCAGAGTTTTGTACATATTGTCCAAATGTTCATCAACACTTCTTAG SEQ ID NO. 58 + 59: murineGM-CSF-wild-type (11CpG)ATGTGGCTGCAGAATTTACTTTTCCTGGGCATTGTGGTCTACAGCCTCTCAGCACCCACCCGCTCACCCATCACTGTCACCCGGCCTTGGAAGCATGTAGAGGCCATCAAAGAAGCCCTGAACCTCCTGGATGACATGCCTGTCACATTGAATGAAGAGGTAGAAGTCGTCTCTAACGAGTTCTCCTTCAAGAAGCTAACATGTGTGCAGACCCGCCTGAAGATATTCGAGCAGGGTCTACGGGGCAATTTCACCAAACTCAAGGGCGCCTTGAACATGACAGCCAGCTACTACCAGACATACTGCCCCCCAACTCCGGAAACGGACTGTGAAACACAAGTTACCACCTATGCGGATTTCATAGACAGCCTTAAAACCTTTCTGACTGATATCCCCTTTGAATGCAAAAAACCAGGCCAAAAATAG
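The labels in the sequence listing above state the CpG count of each variant next to that of the corresponding wild-type gene. As a small worked example, the relative change can be tabulated as follows; this is an illustrative sketch, the counts are taken from the labels rather than recomputed from the sequences, and `relative_change` is a hypothetical helper.

```python
# name -> (wild-type CpG count, variant CpG count), from the sequence labels.
variants = {
    "murine MIP1alpha-0CpG": (7, 0),
    "murine MIP1alpha-2CpG": (7, 2),
    "murine MIP1alpha-4CpG": (7, 4),
    "murine MIP1alpha-13CpG": (7, 13),
    "murine MIP1alpha-42CpG": (7, 42),
    "human MIP1alpha-43CpG": (8, 43),
    "human GM-CSF-63CpG": (10, 63),
    "human IL-15-21CpG": (3, 21),
    "murine GM-CSF-62CpG": (11, 62),
}


def relative_change(wild_type: int, variant: int) -> float:
    """Change in CpG count relative to the wild type, in percent."""
    return 100.0 * (variant - wild_type) / wild_type


for name, (wt, var) in variants.items():
    print(f"{name}: {wt} -> {var} CpGs ({relative_change(wt, var):+.0f}%)")
```

Expressed this way, the counts can be compared directly with percentage thresholds such as those recited in the claims.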

REFERENCES

-   Akiyama, Y., Maesawa, C., Ogasawara, S., Terashima, M. and Masuda, T. (2003) Cell-type-specific repression of the maspin gene is disrupted frequently by demethylation at the promoter region in gastric intestinal metaplasia and cancer cells, Am. J. Pathol. 163, 1911-1919.
-   Antequera, F. and Bird, A. (1993) Number of CpG islands and genes in human and mouse, Proc. Natl. Acad. Sci. U.S.A. 90, 11995-11999.
-   Ausubel, F. M., Brent, R., Kingston, R. E., Moore, D. D., Seidman, J. G., Smith, J. A. and Struhl, K. (1994) Percentage of Codon Synonymous Usage and Frequency of Codon Occurrence in Various Organisms, Current Protocols in Molecular Biology 2, A1.8-A1.9.
-   Bird, A. P. (1980) DNA methylation and the frequency of CpG in animal DNA, Nucleic Acids Res. 8, 1499-1504.
-   Chevalier-Mariette, C., Henry, I., Montfort, L., Capgras, S., Forlani, S., Muschler, J. and Nicolas, J. F. (2003) CpG content affects gene silencing in mice: evidence from novel transgenes, Genome Biol. 4, R53.
-   Choi, Y. S., Kim, S., Kyu, L. H., Lee, K. U. and Pak, Y. K. (2004) In vitro methylation of nuclear respiratory factor-1 binding site suppresses the promoter activity of mitochondrial transcription factor A, Biochem. Biophys. Res. Commun. 314, 118-122.
-   Deml, L., Bojak, A., Stock, S., Graf, M., Wild, J., Schirmbeck, R., Wolf, H. and Wagner, R. (2003) Multiple Effects of Codon Usage Optimization on Expression and Immunogenicity of DNA Candidate Vaccines Encoding the Human Immunodeficiency Virus Type 1 Gag Gene, J. Virol. 75 No. 22, 10999-11001.
-   Deng, G., Chen, A., Pong, E. and Kim, Y. S. (2001) Methylation in hMLH1 promoter interferes with its binding to transcription factor CBF and inhibits gene expression, Oncogene 20, 7120-7127.
-   Duan, J. and Antezana, A. (2003) Mammalian Mutation Pressure, Synonymous Codon Choice, and mRNA Degradation, J. Mol. Evol. 57, 694-701.
-   Hendrich, B. and Bird, A. (1998) Identification and characterization of a family of mammalian methyl-CpG binding proteins, Mol. Cell Biol. 18, 6538-6547.
-   Hisano, M., Ohta, H., Nishimune, Y. and Nozaki, M. (2003) Methylation of CpG dinucleotides in the open reading frame of a testicular germ cell-specific intronless gene, Tact1/Actl7b, represses its expression in somatic cells, Nucleic Acids Res. 31, 4797-4804.
-   Hsieh, C. L. (1994) Dependence of transcriptional repression on CpG methylation density, Mol. Cell Biol. 14, 5487-5494.
-   Ivanova, T., Vinokurova, S., Petrenko, A., Eshilev, E., Solovyova, N., Kisseljov, F. and Kisseljova, N. (2004) Frequent hypermethylation of 5′ flanking region of TIMP-2 gene in cervical cancer, Int. J. Cancer 108, 882-886.
-   Jones, P. L., Veenstra, G. J., Wade, P. A., Vermaak, D., Kass, S. U., Landsberger, N., Strouboulis, J. and Wolffe, A. P. (1998) Methylated DNA and MeCP2 recruit histone deacetylase to repress transcription, Nat. Genet. 19, 187-191.
-   Kang, G. H., Lee, S., Lee, H. J. and Hwang, K. S. (2004) Aberrant CpG island hypermethylation of multiple genes in prostate cancer and prostatic intraepithelial neoplasia, J. Pathol. 202, 233-240.
-   Kudo, S. (1998) Methyl-CpG-binding protein MeCP2 represses Sp1-activated transcription of the human leukosialin gene when the promoter is methylated, Mol. Cell Biol. 18, 5492-5499.
-   Laemmli, U. K. (1970) Cleavage of structural proteins during the assembly of the head of bacteriophage T4, Nature 227, 680-685.
-   Larsen, F., Gundersen, G., Lopez, R. and Prydz, H. (1992) CpG islands as gene markers in the human genome, Genomics 13, 1095-1107.
-   Li, Q. L., Kim, H. R., Kim, W. J., Choi, J. K., Lee, Y. H., Kim, H. M., Li, L. S., Kim, H., Chang, J., Ito, Y., Youl, L. K. and Bae, S. C. (2004) Transcriptional silencing of the RUNX3 gene by CpG hypermethylation is associated with lung cancer, Biochem. Biophys. Res. Commun. 314, 223-228.
-   Nan, X., Ng, H. H., Johnson, C. A., Laherty, C. D., Turner, B. M., Eisenman, R. N. and Bird, A. (1998) Transcriptional repression by the methyl-CpG-binding protein MeCP2 involves a histone deacetylase complex, Nature 393, 386-389.
-   Shen, J. C., Rideout, W. M., III and Jones, P. A. (1994) The rate of hydrolytic deamination of 5-methylcytosine in double-stranded DNA, Nucleic Acids Res. 22, 972-976.
-   Sved, J. and Bird, A. (1990) The expected equilibrium of the CpG dinucleotide in vertebrate genomes under a mutation model, Proc. Natl. Acad. Sci. U.S.A. 87, 4692-4696.
-   Takai, D. and Jones, P. A. (2002) Comprehensive analysis of CpG islands in human chromosomes 21 and 22, Proc. Natl. Acad. Sci. U.S.A. 99, 3740-3745.
-   Voo, K. S., Carlon, D. L., Jacobsen, B. U., Flodin, A. and Skalnik, D. (2000) Cloning of a Mammalian Transcriptional Activator That Binds Unmethylated CpG Motifs and Shares a CXXC Domain with DNA Methyltransferase, Human Trithorax, and Methyl-CpG Binding Domain Protein 1, Mol. Cell. Biol. Mar. 2000, 2108-2121.
-   Wade, P. A., Gegonne, A., Jones, P. L., Ballestar, E., Aubry, F. and Wolffe, A. P. (1999) Mi-2 complex couples DNA methylation to chromatin remodeling and histone deacetylation, Nat. Genet. 23, 62-86.
-   Wise, T. L. and Pravtcheva, D. D. (1999) The undermethylated state of a CpG island region in igf2 transgenes is dependent on the H19 enhancers, Genomics 60, 258-271.
-   Wolf, H., Modrow, S., Soutschek, E., Motz, M., Grunow, R. and Döbl, H. (1990) Production, mapping and biological characterisation of monoclonal antibodies to the core protein (p24) of the human immunodeficiency virus type 1, AIFO 1, 24-29.
-   Wu, Q., Shi, H., Suo, Z. and Nesland, J. M. (2003) 5′-CpG island methylation of the FHIT gene is associated with reduced protein expression and higher clinical stage in cervical carcinomas, Ultrastruct. Pathol. 27, 417-422.
-   Yao, X., Hu, J. F., Daniels, M., Shiran, H., Zhou, X., Yan, H., Lu, H., Zeng, Z., Wang, Q., Li, T. and Hoffman, A. R. (2003) A methylated oligonucleotide inhibits IGF2 expression and enhances survival in a model of hepatocellular carcinoma, J. Clin. Invest. 111, 265-273.
-   Yoshida, M., Nosaka, K., Yasunaga, J. I., Nishikata, I., Morishita, K. and Matsuoka, M. (2003) Aberrant expression of the MEL1S gene identified in association with hypomethylation in adult T-cell leukemia cells, Blood.

1. Method for the targeted modulation of the gene expression, comprising the steps: (i) Provision of a target nucleic acid sequence to be expressed, (ii) Modification of the target nucleic acid sequence, in which the number of CpG dinucleotides present in the target nucleic acid sequence is raised using the degeneracy of the genetic code to increase the gene expression, or is lowered to reduce the gene expression, (iii) Cloning of the thereby modified target nucleic acid sequence with a modified number of the CpG dinucleotides in a suitable expression vector in operative coupling with a suitable transcription control sequence, (iv) Expression of the modified target nucleic acid sequence in a suitable expression system.
2. Method according to claim 1, in which in step (ii) the modification of the target nucleic acid sequence is carried out so that, in addition to increasing or reducing the number of CpG dinucleotides, one or more additional modifications is/are carried out at the nucleic acid level.
3. (canceled)
4. Method according to claim 1, in which the modification of the target nucleic acid sequence by increasing or reducing the number of CpG dinucleotides is carried out having regard to a codon choice optimised for the expression system.
5. (canceled)
6. Method according to claim 1, in which the gene expression is raised.
7. Method according to claim 1, in which the gene expression is reduced.
8. Method according to claim 1, in which the target nucleic acid sequence to be expressed is heterologous to the expression system.
9. Method according to claim 1, in which a eukaryotic or prokaryotic expression system is used as the expression system.
10.-12. (canceled)
13. Method according to claim 1, in which the modified target nucleic acid sequence and the transcription control sequence are not associated with CpG islands.
14. Method according to claim 1, in which the number of CpG dinucleotides is increased or reduced by at least two.
15. Method according to claim 1, in which the number of CpG dinucleotides is increased or reduced by at least 10%, preferably at least 50%, more preferably at least 100%.
16. Method according to claim 1, in which all CpG dinucleotides are removed using the degeneracy of the genetic code.
17.-20. (canceled)
21. Method according to claim 1, in which the target nucleic acid sequence codes for a functional RNA.
22. (canceled)
23. Modified nucleic acid with a region capable of transcription that can be expressed in an expression system, and which is derived from a wild-type sequence, in which the region capable of transcription is modified so that it is codon-optimised in relation to the employed expression system, and so that the number of CpG dinucleotides is increased compared to the codon-optimised sequence derived from the wild-type sequence, by using the degeneracy of the genetic code.
24. Nucleic acid according to claim 23, in which the number of CpG dinucleotides is increased compared to the wild-type sequence by at least 10%, preferably at least 25%, more preferably at least 50%, particularly preferably at least 100%, more particularly preferably at least 200%, especially by a factor of 5 and most especially by a factor of 10 or more.
25.-26. (canceled)
27. Vector comprising a nucleic acid according to claim 23 in operative coupling with a suitable transcription control sequence.
28. Vector according to claim 27, in which the transcription control sequence comprises a promoter.
29.-36. (canceled)
37. Cell containing a nucleic acid or a vector according to claim 23.
38. Expression system comprising a) a modified nucleic acid sequence with a region capable of transcription, which is derived from a wild-type sequence, wherein the modified nucleic acid sequence has an increased or reduced number of CpG dinucleotides compared to the wild-type sequence, in operative coupling with a transcription control sequence, and b) an expression environment selected from a cell and a cell-free expression environment wherein a) can be expressed, in which the expression system in the case of expression of a modified nucleic acid sequence with an increased number of CpG dinucleotides exhibits an increased expression, and in the case of expression of a modified nucleic acid sequence with a reduced number of CpG dinucleotides exhibits a reduced expression.
39. (canceled)
40. Use of a nucleic acid and/or a vector and/or a cell and/or an expression system according to claim 23 for the production of a medicament for a diagnostic and/or therapeutic treatment.
41.-42. (canceled)